CHIMERIC MULTIVALENT PROTEIN ANALOGUES AND
METHODS OF USE THEREOF 



Description 

Background of the Invention

The immunoglobulin (Ig) gene superfamily comprises
numerous cell surface and soluble molecules that mediate
recognition, adhesion or binding functions in vertebrates.
(Abbas, A.K., et al., CELLULAR AND MOLECULAR IMMUNOLOGY,
p. 144 (1991)). Members of the Ig Superfamily have an
evolutionary relationship and share significant amino acid
sequence and structural similarities. (Williams, A.F. and
Barclay, A.N., IMMUNOGLOBULIN GENES, p. 372 (1989)). Two
criteria for membership within the family are: 1) sequence
homology with Ig or Ig-related polypeptide domains, which
are approximately 70-110 amino acid residues long, and
2) key structural features, namely polypeptide domains
comprised of a sandwich arrangement of two β-sheets, each
made up of four or five anti-parallel β-strands of five to
ten amino acid residues. (Abbas, A.K., et al., CELLULAR AND
MOLECULAR IMMUNOLOGY, pp. 144-145 (1991)).

WO 93/23537
PCT/US93/04338

The Ig Superfamily domains are classified as either
variable (V) or constant (C) based on characteristics of
the β-strands within the β-sheet sandwich. (Abbas, A.K.,
et al., CELLULAR AND MOLECULAR IMMUNOLOGY, p. 144 (1991)).
For example, in one class of Superfamily molecules, the
immunoglobulins, V domains are at the amino-terminal ends
of separate "heavy" (H) and "light" (L) chains, succeeded
in the polypeptide chains by constant (C) domains. Thus, in
an immunoglobulin, a V domain is defined as either VH or VL
(Figure 1). In the T cell receptor, V and C refer to Ig-like
variable and constant domains which are comprised of
polypeptide α and β chains. In other Superfamily molecules,
V and C domains may be comprised of γ, δ, or ε chains.

In the Ig Superfamily, the polypeptide chains that
comprise the V regions associate to form ligand binding
sites. For example, in the immunoglobulin molecule, the VH
and VL domains associate to form the variable fragment (Fv)
region, which comprises the antibody binding site. The Fv
region includes both scaffold-like regions, termed
framework regions (FRs), and regions of hypervariability,
termed complementarity-determining regions (CDRs). It is
the CDRs that contribute to the unique antigen specificity
of immunoglobulins. (Abbas, A.K., et al., CELLULAR AND
MOLECULAR IMMUNOLOGY, p. 45 (1991)). Under special
circumstances, the Fv region has been proteolytically
dissected from its parent Ig to yield a variable-region
fragment (Fv fragment) that is comprised of two
non-covalently associated domains (VH·VL, a heterodimer).
This heterodimeric Fv fragment can further dissociate into
single VL and VH domains. (Huston, J.S., et al., Meth.
Enzymol. 203:46-88 (1991)).

Recently, protein engineering methods have been used to
link VH and VL chains, creating a functional single-chain
Fv (sFv), which has a solitary antibody binding site that
does not dissociate into single domains at low
concentrations. (Huston, J.S., et al., Proc. Natl. Acad.
Sci. USA 85:5879-5883 (August 1988)). In this approach, the
genes encoding VH and VL domains of a given antibody are
connected at the DNA level by an appropriate nucleotide
sequence, and on translation, this gene forms a single
polypeptide chain with a peptide linker bridging the two
variable domains. (Huston, J.S., et al., Meth. Enzymol.
203:46-88 (1991)).

Summary of the Invention 

The present invention relates to a chimeric
immunoglobulin (Ig) Superfamily protein analogue having
more than one biologically active binding site.
Hereinafter, the term multivalent will be used to describe
these multiple binding sites. The chimeric multivalent Ig
Superfamily protein analogue, hereinafter referred to as a
CHI-protein, or χ-protein, is comprised of one or more
polypeptide chains forming a β-barrel domain. A single
β-barrel domain may comprise a chimeric protein binding
domain with more than one binding site. Alternatively, more
than one β-barrel domain, such as the VL and VH β-barrel
domains in immunoglobulins, may combine to form a larger
concentric β-barrel domain having more than one binding
site.

The β-barrel domain(s) comprising the binding regions
have amino acid sequence and structural homology with
variable regions of molecules related to the Ig Superfamily
of molecules. Specifically, the binding sites on the
χ-protein are comprised of hypervariable regions derived
from molecules related to the Ig Superfamily of molecules.
The Ig Superfamily includes immunoglobulins, cell surface
antigens, such as T lymphocyte antigens, and cell surface
receptors, such as immunoglobulin Fc receptors.

In a preferred embodiment, the hypervariable regions are
complementarity-determining regions (CDRs) derived from the
antigen binding sites of immunoglobulins. In this
embodiment, the multivalent protein analogue comprises one
or more polypeptide chains forming a β-barrel domain
containing CDRs interspersed between framework regions
(FRs). These CDRs define one antigen binding site. This
multivalent protein analogue also has one or more
additional antigen binding sites spliced into the FRs of
the β-barrel domain.

In one embodiment, the χ-protein will comprise a single
polypeptide chain forming a β-barrel domain. In another
embodiment, the χ-protein will comprise a single
polypeptide chain comprised of two polypeptide chains,
connected by a polypeptide linker spanning the distance
between the C-terminus of one chain and the N-terminus of
the other chain, forming a β-barrel domain. In yet another
embodiment, the χ-protein will comprise two non-covalently
associated polypeptide chains forming a β-barrel domain. In
each of the above embodiments, the polypeptide chains fold
to form a β-barrel domain with two or more binding sites.

The invention also relates to the amino acid sequences
that constitute the χ-protein, the DNA sequences that
encode the amino acid residues forming the χ-protein, and
expression vectors comprising and capable of expressing the
DNA sequences. The invention also relates to methods of
producing the χ-protein.

The invention further relates to compositions comprised
of the χ-proteins and methods of use thereof. These
compositions are comprised of χ-proteins having two
biologically active binding sites and may be used in a
variety of therapeutic and diagnostic procedures. These
compositions include, but are not limited to, χ-protein
biosensors which undergo a conformational change when a
ligand is bound to one binding site such that the affinity
of the second binding site is modified; and χ-proteins
having one binding site reactive with a tissue-specific
ligand and the second binding site reactive with
radioactive ions, radio-opaque substances, cytotoxic
substances, cytotoxic effector cells (e.g., cytotoxic T
cells), drugs, or catalytic substances. These compositions
also include a "biochip" comprised of a two-dimensional
array of aggregated χ-protein biosensors, such as in a
Langmuir-Blodgett film, to make a functionalized membrane
useful for layers of molecular gates for computers and the
like.

The utility of binding proteins having two independent
binding sites of different specificity for the treatment or
control of tumors, virus-infected cells, bacteria and other
pathogenic states has been recognized (Segal, D.M. and
Snider, D.P., Chem. Immunol. 47:179-213 (1989)). Bispecific
binding proteins have been produced by crosslinking two or
more dissimilar but intact antibodies with a chemical agent
(heteroantibodies); crosslinking antibody fragments;
linking two single chain antibodies; or fusing a single
chain antibody to an effector molecule (Segal et al., U.S.
4,676,980; Tai, M., et al., Biochem. 29:8024-8030 (1990)).

However, despite considerable application to medical
research, these previous attempts to produce bispecific
binding proteins suffer from the difficulty of complex
purifications and large molecular size. For example, each
binding region of an IgG is at least 50 kD, so that a
conventional crosslinked, bispecific heteroantibody can
range in mass from 100-150 kD to as much as 300 kD, if two
intact IgG antibodies are cross-linked. Even the single
chain antibody has a mass of approximately 26 kD, so that a
single polypeptide chain comprising two separate binding
regions, each comprised of a single chain antibody, has a
mass of approximately 50 kD.
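
The size comparison above is simple addition, and can be sketched as follows. This is only an illustrative calculation: the 150 kD figure for an intact IgG is inferred from the "as much as 300 kD" statement for two crosslinked IgGs, the mass of any crosslinker or peptide linker is ignored, and all names are hypothetical.

```python
# Approximate masses in kD, taken from the paragraph above.
BINDING_REGION_KD = 50   # one IgG binding region ("at least 50 kD")
INTACT_IGG_KD = 150      # assumption: implied by 300 kD for two intact IgGs
SFV_KD = 26              # one single chain antibody


def combined_mass(*parts_kd):
    """Sum component masses, ignoring crosslinker or linker mass."""
    return sum(parts_kd)


two_binding_regions = combined_mass(BINDING_REGION_KD, BINDING_REGION_KD)
two_intact_iggs = combined_mass(INTACT_IGG_KD, INTACT_IGG_KD)
two_sfvs = combined_mass(SFV_KD, SFV_KD)

print(two_binding_regions, two_intact_iggs, two_sfvs)  # 100 300 52
```

The 52 kD sum for two single chain antibodies corresponds to the "approximately 50 kD" quoted in the text.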

The recombinantly engineered χ-proteins of the present
invention have significant advantages over conventionally
crosslinked antibodies, or even the single-chain Fv. The
proteins of the present invention can be custom-designed to
bind specific ligands and cell surface receptors with a
desired affinity or specificity. These custom-designed
multivalent binding proteins can be smaller and more
compact in size than intact hetero-bispecific antibodies,
Fab' antibody binding fragments or bispecific sFv-sFv
constructs.

These χ-proteins can also be less immunogenic, thereby
reducing the likelihood of immune reactions to such
therapeutic compositions. For example, in the intact
immunoglobulin molecule, the bottom loops of the variable
region are sterically protected from the recognition that
would result in an immune response. However, when
dissociated from the protecting constant region, as in the
Fv, the exposed bottom loop region potentially becomes
antigenic. In a χ-protein, this bottom loop region is no
longer exposed, which significantly reduces the likelihood
of an immune response. Furthermore, due to their smaller
size, the χ-proteins can have enhanced stability when
administered intravenously, as they will be less
susceptible to proteolysis by endogenous proteases than
larger, multi-domain proteins.

Brief Description of the Drawings

The foregoing features and advantages of the invention
will be apparent from the following more particular
description in the following drawings and text.

Figure 1 is a schematic representation of a typical Ig
Superfamily molecule, an immunoglobulin, depicting the
variable (V) and constant (C) regions.

Figure 2A is a schematic representation of an Fv
depicting the relative positions of the top loops (TL) and
bottom loops (BL) of the folded heavy and light chains. The
TLs typically form the CDRs of the Fv and the BLs are
typically adjacent to the C region when within an intact
immunoglobulin. The BLs comprise the loops suitable for
splicing on the second binding site.

Figure 2B shows an alignment of VL and VH amino acid
sequences for which three-dimensional structures are known
(SEQ ID NOS: 1-9 for VL and SEQ ID NOS: 10-16 for VH). The
alignment of these sequences is based on the structural
homology that exists between them, especially in the
β-strand regions. Regions of the sequence corresponding to
structural regions of inner-β-strand (IS), outer-β-strand
(OS), top loops (TL) and bottom loops (BL) are identified.
The number scale corresponds to structural position,
especially in the OS and IS regions.

Figure 3A is a stereo figure depicting two copies of the
McPC603 Fv structure wherein the top loops of one (right
side up) structure are superimposed with the bottom loops
of the other (upside down) structure. For clarity, only the
top loops (ribbons) and bottom loops (line trace) are
shown.

Figure 3B is a stereo figure depicting the alignment of
H2 with the N- and C-terminal strands of the inverted Fv
structure.

Figures 4A-4G depict in stereo the positions of the
native CDRs and the CDRs spliced into the McPC603 bottom
loops to form an additional binding site (χ-site).
Corresponding native and χ-site CDR loops are highlighted
as ribbons.

Figure 5 depicts the stereo comparison of the χ-protein
comprised of native McPC603 with McPC603 CDRs spliced into
the BLs of the native McPC603.

Figure 6 depicts the splice points used when H3 of a
second McPC603 is spliced into H BL2 of the native McPC603
sFv. The ribbon follows the spliced trace.

Figure 7 depicts the splice points used when the L3 loop
is spliced into the L BL2 of the native McPC603.

Figure 8 depicts the splice points used when the H1 loop
is spliced into H BL4 of the native McPC603.

Figure 9 depicts the splice points used when the L1 loop
is spliced into L BL4 of the native McPC603.

Figure 10 depicts the splicing of the H2 loop into the
C-terminus of VH of native McPC603.

Figure 11 depicts the splicing of the L2' loop onto the
C-terminus of VL of native McPC603.

Figure 12 shows the final amino acid sequence of the
χ-protein comprised of two McPC603 binding sites as
constructed according to Examples 1 and 2 (SEQ ID NO: 17).

Figures 13A and B show the alignment of consensus
sequences for the various sequence classes in the Kabat et
al. compendium with the alignment of sequences of known
structure for the VH (Figure 13A) and VL (Figure 13B)
chains (Kabat, E.A., SEQUENCES OF PROTEINS OF IMMUNOLOGICAL
INTEREST Vols. I-III, U.S. Dept. Health and Human Services,
NIH Pub. No. 91-3242 (1991)). The residue preference list
in Table 2 is derived from this figure. The first group of
sequences are as in Figure 2B (SEQ ID NOS: 1-16). The
second group of sequences are consensus sequences from the
groupings in Kabat et al. (Figure 13A, SEQ ID NOS: 18-28
and Figure 13B, SEQ ID NOS: 31-38). The third group of
sequences are known sequences for mouse immunoglobulins
26-10 and GLOOP4 (Figure 13A, SEQ ID NOS: 29-30 and Figure
13B, SEQ ID NOS: 39-40). The line labeled KABAT indicates
the boundaries of framework and CDR regions as defined in
Kabat et al. ("-" indicates alignment gaps).

Figure 14 shows the sequences of χ-protein constructs as
constructed according to Examples 2 and 3 (SEQ ID NOS:
41-44).

Figure 15A shows the nucleic acid and amino acid
sequences of the χ-1 protein, with double dashes indicating
the D1.3H1 χ-loop insertion in the 26-10 sFv (SEQ ID NOS:
45 and 46, respectively).

Figure 15B shows the nucleic acid and amino acid
sequences of the χ-2 protein, with double dashes indicating
the D1.3H1 and H2 χ-loop insertions in the 26-10 sFv (SEQ
ID NOS: 46 and 47, respectively).

Figure 16A shows the SDS polyacrylamide gel
electrophoresis of the χ-1 protein purified by
ouabain-Sepharose affinity chromatography.

Figure 16B shows the SDS polyacrylamide gel
electrophoresis of the χ-2 protein purified by
ouabain-Sepharose affinity chromatography.

Figure 17 shows the elution profiles of the χ-2 protein
and the 26-10 sFv from a Superdex 75 column.

Detailed Description of the Invention

The present invention relates to a chimeric multivalent
immunoglobulin (Ig) Superfamily protein analogue,
hereinafter referred to as a χ-protein, comprising one or
more polypeptide chains forming a β-barrel domain. The
β-barrel domain contains hypervariable regions (hereinafter
called complementarity-determining region-like (CDR-like)
regions) and structural regions (hereinafter called
framework-region-like (FR-like) regions). The CDR-like
regions define ligand binding sites. Additionally, the
χ-protein has at least one more ligand binding site segment
spliced into the FR-like regions of the β-barrel domain.

The Ig Superfamily molecules show significant amino acid
sequence homology within their β-barrel domains. (Williams,
A.F. and Barclay, A.N., IMMUNOGLOBULIN GENES p. 362
(1989)). The amino acid sequences of the amino terminal
domains are called variable (V) regions and the more
conserved sequences of the remainder of the chain are
termed the constant (C) region (Figure 1). (Abbas, A.K., et
al., CELLULAR AND MOLECULAR IMMUNOLOGY p. 45 (1991)). The
amino acid sequences of both the V and C regions of
Superfamily molecules are formed on two different
polypeptide chains, the heavy (H) chain and the light (L)
chain in immunoglobulins, or the α, β, γ, δ, or ε chains in
other Ig Superfamily molecules. These chains also fold into
V and C regions. For example, each H chain of an
immunoglobulin molecule folds into a VH domain with an
adjacent CH domain and each L chain folds into a VL domain
with an adjacent CL. Each chain has additional successive
constant regions (Figure 1).

The V region contains highly variable, unconserved
stretches of sequence called the hypervariable regions.
More conserved stretches of sequence are called structural
regions. In immunoglobulins, the hypervariable regions are
called complementarity-determining regions (CDRs) and the
structural regions are called framework regions (FRs).
Three CDRs of each VH and three CDRs of each VL combine in
a unique three-dimensional structure to form the antigen or
ligand binding site. These CDRs determine the ligand
specificity of the protein. (Abbas, A.K., et al., CELLULAR
AND MOLECULAR IMMUNOLOGY p. 143 (1991)). Hereinafter, the
hypervariable regions of all Ig Superfamily molecules will
be called CDR-like and all conserved regions will be called
FR-like (or CDRs and FRs in the specific case of an
immunoglobulin molecule).

The Ig Superfamily members also show significant
homology in the structural three-dimensional features of
their V and C domains. This structural feature is known as
the Ig-fold. (Williams, A.F. and Barclay, A.N.,
IMMUNOGLOBULIN GENES p. 362 (1989)). The Ig-fold consists
of a sandwich of two β-sheets constructed from
anti-parallel β-strands, each strand containing five to ten
amino acid residues. The V domain differs from the C domain
by an extra pair of β-strands in the middle of the V
domain. (Williams, A.F. and Barclay, A.N., IMMUNOGLOBULIN
GENES p. 362 (1989)). For example, the V domains or C
domains of an immunoglobulin associate in pairs, such as
VH·VL. Each β-barrel domain, when associated in such a
pair, forms two concentric β-barrels, with the CDR loops
connecting anti-parallel β-strands of the inner barrel.

Members of the Ig Superfamily include, but are not
limited to, Immunoglobulins, T cell Receptor Complex, Major
Histocompatibility Complex Antigens, β2
Microglobulin-associated Antigens, T Lymphocyte Antigens,
Growth Factor Receptors, and Neural Cell Adhesion Molecules
(NCAM). (Williams, A.F. and Barclay, A.N., IMMUNOGLOBULIN
GENES p. 362 (1989)).

Each ligand binding site of the χ-protein is comprised
of the CDR-like region derived from molecules of the Ig
Superfamily. For example, a ligand binding site could be
comprised of the CDRs derived from an immunoglobulin
molecule whose ligand is an antigen. Alternatively, the
ligand binding site could be comprised of CDR-like regions
from a receptor molecule such as the T cell receptor, whose
ligand is the antigen-major histocompatibility complex
(MHC) molecule which, upon binding to its receptor,
initiates T cell activation.

In particular, the present invention relates to the
immunoglobulins of the Ig Superfamily. In natural
immunoglobulins, the antibody combining site is formed by
CDRs of the VH and VL variable domains within the Fv
(variable region consisting of noncovalently associated VH
and VL, or VH·VL), as shown in Figure 1. In addition to the
CDRs determining binding specificity, maintenance of the
tertiary structure is also necessary for biological
activity. The CDRs are correctly positioned by the
conserved framework regions (FRs) within the V regions, and
the V region is further stabilized by the C regions of the
protein. However, the minimal naturally-occurring antibody
binding site is the two-chain, non-covalently associated
Fv.

Recently, through recombinant protein engineering, a
single-chain Fv (sFv) has been constructed. In this
approach, the genes encoding VH and VL domains of a given
antibody are connected at the DNA level by an appropriate
oligonucleotide linker and, on translation, this gene forms
a single polypeptide chain with a peptide linker bridging
the two variable domains. (Huston, J.S., et al., Meth.
Enzymol. 203:46-88 (1991)).

Isolated single domain antibodies have also recently
been constructed, comprised of only three CDRs instead of
the usual complement of six. (Ward, E.S., et al., Nature
341:544-546 (1989)). In some cases, these single domain
antibodies (i.e., VH or VL) exhibit binding activities
comparable to their parent antibodies.

In a preferred embodiment of the present invention, one
of the χ-protein binding sites is comprised of a set of
CDRs from a mouse myeloma protein such as 26-10 or McPC603
(Huston, J.S., et al., Methods Enzymol. 203:46-88 (1991)).
An additional binding site segment is spliced into the
β-barrel domain. This spliced segment is comprised of CDRs
from the same, or a second, mouse myeloma protein, which
are spliced into the bottom FR loops of the β-barrel
domain, or attached to the C-terminal ends of each V
domain.

The structural basis for this invention can be explained
with reference to Figure 2B (SEQ ID NOS: 1-16), where the
typical variable region composition of an immunoglobulin is
noted. There are three CDRs within each VH and VL which
together constitute a complete antibody binding site. These
CDRs are interspersed between FRs such that the light chain
V region is denoted FR1-L1-FR2-L2-FR3-L3-FR4 and the heavy
chain is denoted FR1-H1-FR2-H2-FR3-H3-FR4. Each of these V
region chains folds into a native conformation that
comprises a double layer of β-sheets. V domains may be
monomeric (VH or VL), or associate into homodimers (VH·VH
or VL·VL), or into the heterodimeric Fv (VH·VL). Each of
these dimers may be constructed as single-chain analogues.
In the single-chain Fv, these two sequences are connected
in tandem with a bridging linker to make, for example,
VH-linker-VL (VH-VL) or VL-linker-VH (VL-VH).

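
The segment ordering described above can be written out mechanically. The following is a minimal sketch that only manipulates labels, not sequences; the function names are hypothetical and are not taken from the specification.

```python
def v_region_layout(chain):
    """Return the ordered segment labels of a V region.

    chain is "L" (light) or "H" (heavy); CDRs are labelled L1-L3 or H1-H3.
    """
    if chain not in ("L", "H"):
        raise ValueError("chain must be 'L' or 'H'")
    segments = []
    for i in (1, 2, 3):
        segments.append(f"FR{i}")      # framework region preceding CDR i
        segments.append(f"{chain}{i}")  # CDR i of this chain
    segments.append("FR4")             # closing framework region
    return segments


def single_chain_order(first, second):
    """Label a single-chain construct with the two V domains in tandem."""
    for chain in (first, second):
        if chain not in ("L", "H"):
            raise ValueError("chain must be 'L' or 'H'")
    return f"V{first}-linker-V{second}"


print("-".join(v_region_layout("L")))   # FR1-L1-FR2-L2-FR3-L3-FR4
print("-".join(v_region_layout("H")))   # FR1-H1-FR2-H2-FR3-H3-FR4
print(single_chain_order("H", "L"))     # VH-linker-VL
```

Homodimeric single-chain analogues follow from the same helper, for example `single_chain_order("H", "H")`.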
In these folded configurations of the V region
polypeptide, the β-strands alternate in direction, looping
out at either end as they form anti-parallel strands. In
each region the immunoglobulin fold has loops on top (the
binding site is "on top") and on the bottom (partly in
close proximity to constant domains for intact H and L
chains).

On top of each V domain, four loops are present, of
which three are CDRs that contribute to the antigen binding
site, and on the bottom, four loops are present that allow
the β-strands to switch directions and fold back into the
globular domain. These bottom loops are here termed BL1,
BL2, BL3, and BL4, and apply to the L and H variable
regions, respectively called L BL1-4 and H BL1-4 (Figure
4A). These bottom loops provide insertion sites for fusion
proteins or peptide segments, which are further augmented
by splicing of peptide sequence at the C-terminus of each V
region. The use of these bottom loops as splice sites
permits incorporation of alternate binding, catalytic or
effector sites which complement the naturally present
antigen binding site.

However, novel demands are made on variable region
architecture in order to construct additional binding sites
on the bottom of a V domain. In particular, there is a
requirement that the relative orientation of the CDR-like
loops (CDR or top loop symmetry) for an Fv region be
reproduced to a reasonable approximation in the relative
orientation of the bottom loops of the Fv (bottom loop
symmetry). Whether or not this symmetry relationship exists
is not an obvious question to ask, and the possibility of
top-bottom loop symmetry has not been previously
recognized. Nonetheless, molecular modeling and
computational analysis demonstrate that such an
approximation to the CDR-like loop symmetry does exist for
the bottom loops of the Fv region.

The superposition of top and bottom loops involves two
interrelated observations: (1) the two fixed endpoints of
each top loop (CDR) must find a corresponding match with
bottom loop endpoints; and (2) the polypeptide chain
directionality must be maintained after bottom loop
splicing (i.e., N to C directionality of the spliced CDR
must be the same as that of the bottom loop that it
replaces). The overall architecture of the Fv region
appears asymmetric because the ends of the light and heavy
chain β-barrels are closer together at the top than they
are at the bottom. However, much of this apparent
difference derives from there being one quarter of the Fv
residues in CDRs on top of the framework, which fill in
space that is open on the bottom.
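
The two observations above can be illustrated with a small check on loop endpoints after the two structures have been superimposed. This is a sketch under stated assumptions, not the modeling procedure of the specification: the 2.0 Å tolerance, the coordinates and the function names are all hypothetical.

```python
import math


def endpoints_match(top_ends, bottom_ends, tol=2.0):
    """True if each top-loop endpoint lies within tol (angstroms, assumed)
    of the bottom-loop endpoint at the same N-to-C position."""
    return all(math.dist(t, b) <= tol for t, b in zip(top_ends, bottom_ends))


def splice_compatible(top_ends, bottom_ends, tol=2.0):
    """Classify a top-loop / bottom-loop pairing.

    Each argument is (N-terminal endpoint, C-terminal endpoint),
    every endpoint being an (x, y, z) tuple.
    """
    if endpoints_match(top_ends, bottom_ends, tol):
        return "compatible"   # endpoints coincide, same N-to-C direction
    if endpoints_match(top_ends, bottom_ends[::-1], tol):
        return "reversed"     # endpoints coincide only with direction flipped
    return "no match"


# Hypothetical endpoint coordinates, for illustration only.
top = ((0.0, 0.0, 0.0), (5.0, 0.0, 0.0))
bottom_same = ((0.5, 0.0, 0.0), (5.2, 0.3, 0.0))
bottom_flipped = (bottom_same[1], bottom_same[0])
bottom_far = ((10.0, 10.0, 10.0), (20.0, 20.0, 20.0))

print(splice_compatible(top, bottom_same))     # compatible
print(splice_compatible(top, bottom_flipped))  # reversed
print(splice_compatible(top, bottom_far))      # no match
```

A "reversed" result corresponds to a pairing whose endpoints superimpose but whose chain directionality would be violated by splicing.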

In fact, there is sufficient correlation between top and
bottom loop symmetry of the intact Fv region to make the
multiple binding site molecule possible. This is
discernible if one superimposes the top of one Fv (#1) with
the bottom of another Fv (#2); if #1 and #2 are the same,
then the correlation helps decide how to construct a
multivalent Fv, whereas if #1 and #2 are different, the
additional binding site (χ-site) is distinct from the
native Fv binding site, and overall one thereby designs a
chimeric multivalent Ig Superfamily protein analogue which
has one or more sites of different specificity.

We will represent the composition of the χ-site with
parentheses so that A(B) represents a χ-protein comprising
the CDR-like regions of immunoglobulin Superfamily molecule
B built into the χ-site of immunoglobulin Superfamily
molecule or analogue A. Likewise, VH(VL) represents a VH
domain with a χ-site whose loops are derived from the CDR
loops of a VL domain. Consequently, the designation A(B)
VH(VL)-VL(VH) denotes a χ-protein based on the FR and CDR
of Fv A having a χ-site based on the CDR loops of Fv B,
where the χ-site loops on the heavy chain of A are derived
from the CDR loops of the light chain of B and vice versa,
and the two χ domains are linked by a polypeptide linker
such that the heavy chain FR domain precedes the light
chain FR domain.
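
The parenthesis notation can be generated from a small data structure, which makes the roles of the host Fv, the donor Fv and each domain explicit. The class and function names below are hypothetical and serve only to illustrate the naming scheme; "chi-site" stands for the additional binding site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChiDomain:
    framework: str    # "H" or "L": V domain supplying the FR and native CDRs
    site_source: str  # "H" or "L": V domain whose CDR loops form the chi-site

    def label(self):
        return f"V{self.framework}(V{self.site_source})"


def designation(host, donor, domains):
    """Build the designation for host Fv `host` carrying a chi-site from `donor`.

    `domains` lists the chi domains in N-to-C order along the linked chain.
    """
    return f"{host}({donor}) " + "-".join(d.label() for d in domains)


name = designation("A", "B", [ChiDomain("H", "L"), ChiDomain("L", "H")])
print(name)  # A(B) VH(VL)-VL(VH)
```

Listing the heavy chain domain first encodes the statement that the heavy chain FR domain precedes the light chain FR domain.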

As shown in Figure 3A, by applying these procedures to
McPC603 for both Fv #1 and #2, the following pairs of loops
align:

1. The TL4 loops (L3 and H3) of Fv #1 can be aligned
very closely with the BL2 loops of Fv #2, and with
the alignment of these two sets of loops,
additional alignments become apparent.

2. The TL1 loops (L1 and H1) are approximately
superimposed on the respective BL4 loops.

3. In addition, as shown in Figure 3B, the ends of
the TL2 loops (L2 and H2) are approximately
superimposed on the N-terminal and C-terminal
strands of the respective β-barrel domains.

In this structural alignment, all of the fundamental
design criteria for splicing a second binding site onto the
bottom of V, Fv, or sFv regions are satisfied: the proper
chain directionalities and superpositions are found for the
TL4-BL2 pair, the TL1-BL4 pair, and the ends of the TL2
loops relative to the N- and C-terminal strands of the
β-barrel region.
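
The loop pairings listed above amount to a lookup from each CDR to its splice site on the bottom of the domain. The sketch below encodes them as a table; the dictionary and function names are hypothetical. For the second CDR, the text states that the TL2 loop ends superimpose on the N- and C-terminal strands, and Figures 10 and 11 describe the splice as being at the C-terminus of the V domain, which is what the helper reports.

```python
# CDR number -> top loop carrying it (L3/H3 sit on TL4, per the list above).
CDR_TO_TOP_LOOP = {1: "TL1", 2: "TL2", 3: "TL4"}

# Top loop -> bottom-side site it superimposes on.
TOP_TO_BOTTOM = {
    "TL1": "BL4",
    "TL2": "C-terminus",  # ends align with the N- and C-terminal strands
    "TL4": "BL2",
}


def splice_site(chain, cdr_number):
    """Return the bottom-side splice site for a CDR of chain "H" or "L"."""
    if chain not in ("H", "L"):
        raise ValueError("chain must be 'H' or 'L'")
    target = TOP_TO_BOTTOM[CDR_TO_TOP_LOOP[cdr_number]]
    if target.startswith("BL"):
        return f"{chain} {target}"
    return f"C-terminus of V{chain}"


for chain in ("H", "L"):
    for cdr in (1, 2, 3):
        print(f"{chain}{cdr} -> {splice_site(chain, cdr)}")
```

BL1 and BL3 do not appear in the table because they are not among the pairings listed above.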

This construction is consistent to a good approximation
with the natural geometry for all CDR loops, thereby
generating a ligand binding site on the bottom of the
β-barrel domain similar to the "source" Fv binding site.
Similarly, an additional ligand binding site may be built
on the separate non-covalently linked V domains of a
heterodimeric Fv (i.e., VH·VL). Alternatively, as it has
recently been shown that the full set of 6 CDRs is not
necessary for ligand binding, partial binding sites (i.e.,
fewer than 6 CDRs) may be assembled on a single V domain.
Thus, a χ-protein could comprise a single β-barrel domain
with two ligand binding sites, each binding site with as
few as one CDR, and still exhibit binding activity.

H BL1 or L BL1 loop replacement could also be useful, as
they are in proximity to the rest of the binding site.
However, their peptide chain directionality is opposite to
what would be needed to correctly splice in H2 and L2
loops. Nonetheless, such loops could be devised de novo by
design or mutagenesis to facilitate such replacement.

Furthermore, in most Fv regions, the H or L BL3 loops
are located on the sides of the V domains. Although these
loops are not contiguous with the rest of the additional
binding site, ancillary loops could be attached at such
sites, thereby providing the addition of a "ligand-like"
surface feature for recognition by the appropriate
receptor, thus forming a χ-protein with more than two
binding sites. A single substitution of only one loop, with
an appropriate peptide, could provide a means to anchor the
binding protein to a particular receptor. Moreover, the H
or L BL3 (or H or L TL3) loops could provide the means to
crosslink single domain χ-proteins into aggregate sheets to
form two-dimensional arrays of χ-proteins.

CDR-like region sequences of any of the proteins from
the Ig Superfamily showing the requisite sequence and
structural homology can be spliced into the bottom loops of
the source Fv. It should be noted that CDR-like region
sequences from other Ig Superfamily molecules can also
replace all or a part of the native CDR-like region of the
χ-protein. Thus a χ-protein can comprise the CDR-like
regions from molecule A, the FR-like regions from molecule
B and the χ-site having CDR-like regions from molecule C
spliced in.

As discussed in detail in Examples 1 and 2, using the Fv
derived from mouse myeloma IgA antibody, McPC603, a second
McPC603 binding site can be constructed on the bottom of
the source Fv. Although the Fv region of McPC603 is
exemplified, as shown in Example 3, it is reasonable to
splice in a CDR-like region from other Ig Superfamily
proteins, due to the sequence and structural homology that
exists among these proteins (Figure 2B (SEQ ID NOS: 1-16)).

When CDR-like sequences are spliced into the lower
FR-like loops of the source Fv, it is important to
preserve those FR-like residues that are critical for
maintaining the proper folding. These critical FR-like
residues are located in the stems of the loops and
underlying the CDR-like regions. Three criteria govern
the selection of the proper splice points: first, as
discussed above, the need to preserve critical FR-like
residues; second, the desire to incorporate as much of
the CDR-like sequence as possible; and third, the
practical need to switch from one backbone to the other
at points where the alpha carbons of the two chains are
reasonably well aligned. The last requirement is
necessary in order for the spliced loops to maintain
their native, biologically active conformation.
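The third criterion can be pictured with a short sketch. The
Python fragment below is only an illustration and is not the
procedure used in the Examples: it assumes two backbones that
have already been superimposed and are given as equal-length
lists of alpha-carbon coordinates in angstroms. The function
name candidate_splice_points and the 2.0 angstrom cutoff are
hypothetical choices made for the illustration.

```python
import math


def candidate_splice_points(ca_source, ca_target, cutoff=2.0):
    """Return indices where paired alpha carbons lie within `cutoff` angstroms.

    ca_source, ca_target: equal-length lists of (x, y, z) tuples for
    structurally aligned positions of two superimposed chains.
    The cutoff is an illustrative assumption, not a value from the text.
    """
    if len(ca_source) != len(ca_target):
        raise ValueError("aligned chains must have the same length")
    return [
        i
        for i, (a, b) in enumerate(zip(ca_source, ca_target))
        if math.dist(a, b) <= cutoff
    ]


# Toy coordinates: the chains nearly coincide at the ends and diverge between.
framework = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
cdr_loop = [(0.5, 0.0, 0.0), (3.8, 4.0, 0.0), (7.6, 5.0, 0.0), (11.4, 1.0, 0.0)]
print(candidate_splice_points(framework, cdr_loop))  # positions 0 and 3 qualify
```

In practice the first two criteria would then be applied to
the surviving positions, discarding any splice that removes a
critical FR-like residue and preferring the pair that keeps
the longest stretch of CDR-like sequence.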

As described in detail in Examples 1 and 2, the
crystallographically determined coordinates of a mouse
myeloma protein structure, such as McPC603, are
visually displayed using a suitable computer graphics
system. This three-dimensional display of the protein
can be spatially rotated, turned or twisted as necessary
to view the polypeptide chains or peptide backbone of
the Fv region. Using the graphics program,
substitutions, additions or deletions can be made to
the structure. These modifications are energy
minimized (optimized) to account for steric hindrance,
bond lengths, bond angles and energy constraints so as
to maintain the critical tertiary structure necessary
for biological binding activity. Thus, in a step-by-step
process involving modification of the polypeptide
chain and successive cycles of refinement calculations
for each modification, a three-dimensional model of the
x-protein, with two distinct binding sites, is built.

As described in Examples 1 and 2, the model used
was McPC603 sFv constructed as V H -V L . However, sFv
proteins constructed as V L -V H , V H -V H , or V L -V L are also
intended to be encompassed by this invention. Two-chain
Fv regions (i.e., non-covalently associated V H and
V L domains, V H *V L ), as well as single domain
antibodies (V H or V L ), are also intended to be
encompassed by this invention. In addition, any other
Ig Superfamily molecule having the requisite sequence
and structural homology is also intended to be
encompassed by this invention.

Alternatively, since the sequence would be known
for the Ig Superfamily CDR-like regions of interest,
alignment procedures that relate tertiary to primary
structure are useful to predict successful x-proteins.
Figures 13A (SEQ ID NOS: 10-16 and 18-30) and 13B (SEQ ID
NOS: 1-9 and 31-40) depict the amino acid residue
alignment of Fv regions from mouse myeloma antibodies
and other Ig Superfamily molecules. As described in
detail in Example 3, this type of sequence alignment
allows approximate choice of splice points without
having a three-dimensional crystallographic structure.
The tertiary structures of framework regions are highly
conserved, thus allowing reliable three-dimensional
models to be predicated on sequence alignment methods.
However, to correctly splice "target" CDR-like
regions into the source Fv, some modifications to the
source FR-like region sequence may be necessary.
Moreover, it may be desirable to modify the native
CDR-like sequence which is spliced into the source Fv, or
the source Fv itself, to modify binding affinity or
specificity. Such modifications are intended to be
encompassed by the subject invention.

For example, one or more amino acid residues can
be substituted by another amino acid of a similar
polarity which acts as a functional equivalent,
resulting in a silent alteration. Substitutes for an
amino acid within the sequence may be selected from
other members of the class to which the amino acid
belongs, such as the nonpolar, polar, and positively or
negatively charged amino acids.
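A class-based substitution check of this kind can be sketched
in a few lines of Python. The grouping below is one commonly
used assignment of the twenty standard residues (one-letter
codes) to the four classes named above; the text does not
enumerate the members of each class, so the table, like the
function names, is an assumption made for illustration.

```python
# One common grouping of the standard amino acids (assumed, not from the text).
AMINO_ACID_CLASSES = {
    "nonpolar": set("AVLIPFWM"),
    "polar": set("GSTCYNQ"),
    "positive": set("KRH"),
    "negative": set("DE"),
}


def residue_class(residue):
    """Return the class name for a one-letter amino acid code."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original, substitute):
    """True when both residues belong to the same class."""
    return residue_class(original) == residue_class(substitute)


print(is_conservative("I", "V"))  # isoleucine and valine are both nonpolar
print(is_conservative("D", "K"))  # opposite charges, so not conservative
```

Such a check only flags candidates; whether a given
substitution is truly silent still has to be confirmed by
measuring binding.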

The structure may be modified by deletions,
additions, substitutions and insertions of one or more
amino acids which do not substantially detract from the
desired functional properties of the x-protein.
Naturally occurring allelic variations and
modifications are included within the scope of this
invention so long as the variation does not
substantially reduce the ability of the x-protein to
bind its ligand.

Based on the method described in Example 3, a
x-protein has been partially constructed incorporating
two loops of an anti-lysozyme monoclonal antibody in
the 26-10 sFv, as described in detail in Example 4. As
shown in Figures 15A and 15B (SEQ ID NOS: 45-48),
constructions have been made which incorporate the H1
and H2 loops of the D1.3 anti-lysozyme monoclonal
antibody in the appropriate x-sites at the bottom of
26-10 sFv, where they are designated H1' and H2'. The
design process has involved successive addition of
x-loops, so that there are constructions with H1' alone
(x-1, Figure 15A), and H1' + H2' together (x-2, Figure
15B). In constructing x-1 and x-2, the only deviations
from the corresponding partial constructs described in
Example 3 (Figure 14, 26-10(D1.3) sequence) are the
following: (1) the isoleucine at the N-terminal end of
H2' at V H residue 112 is a valine in x-2 due to
limitations imposed by the restriction sites that were
utilized, and (2) in both x-1 and x-2, the linker is
(Ser 4 Gly) 3 Ser, which reasonably confers better
solubility properties on the proteins than the linker
noted in Figure 14, Ala 5 (Gly 4 Ser) 3 . The stepwise
incorporation of the remaining D1.3 CDRs (H3', L1', L2',
and L3') can be accomplished by the same procedure.

As each CDR loop was inserted in the x-site, the
cloned proteins were expressed in E. coli and refolded.
The recovery of ouabain binding capacity by these
x-proteins demonstrates that the insertion of these
x-CDRs was compatible with proper refolding of the
26-10(D1.3) sFv polypeptide chain, thereby generating a
26-10 digoxin binding site in each case. (Ouabain is a
digoxin analogue with K a for association with 26-10 sFv
of about 10^7 M^-1, and ouabain-based affinity
chromatography is the preferred isolation method for
digoxin binding proteins.)
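To put the quoted association constant in more familiar
terms, the following Python sketch converts it to a
dissociation constant and estimates site occupancy. It
assumes simple 1:1 equilibrium binding with a known free
ligand concentration, which is a simplification introduced
here rather than a model stated in the text; the function
names are hypothetical.

```python
import math


def dissociation_constant(k_assoc):
    """K d (in M) is the reciprocal of the association constant K a (in 1/M)."""
    return 1.0 / k_assoc


def fraction_bound(free_ligand, k_assoc):
    """Fraction of sites occupied for 1:1 binding at free ligand conc. (M)."""
    x = k_assoc * free_ligand
    return x / (1.0 + x)


K_A_OUABAIN = 1e7  # approximate value quoted above, in 1/M

print(dissociation_constant(K_A_OUABAIN))          # about 1e-7 M (100 nM)
print(fraction_bound(1e-7, K_A_OUABAIN))           # about half the sites filled
print(math.isclose(fraction_bound(1e-6, K_A_OUABAIN), 10 / 11))
```

Under this assumption, half of the binding sites are
occupied when the free ouabain concentration equals the
dissociation constant.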

Thus, experimental results support the rationale
of inserting CDR loops in place of turns in β-sheet
structure or near the C-terminal ends of V-regions.
The maintenance of V region integrity after insertion
of x-loops is a fundamental premise for the
construction of these dual binding site x-proteins.
The ability to make a biologically-active xsFv protein
(i.e., a x-protein that retains its binding activity
due to proper conformation) with 2 inserted loops (here
comprising 24 CDR residues) is a remarkable result, in
that the phage display manipulation of sFv binding
sites may be used in entirely novel ways to genetically
convert these added x-site residues into virtually any
desired antigen specificity. These results also
reasonably support the anticipated ability to closely
mimic a parent antibody combining site once more x-CDRs
have been incorporated.

The present invention also relates to the amino
acid sequences encoding the x-protein. Once the
optimal splice points for CDR-like region insertion
have been determined and shown to adequately reproduce
the native site, the linear amino acid sequence can
then be deduced. The x-protein, or its modified
equivalent, can then be made by recombinant DNA methods
or synthesized directly by standard solid or liquid
phase chemistries for peptide synthesis, or other
methods well known to one skilled in the art. For
example, the amino acid sequence of the x-protein
McPC603 (McPC603) shown in Figure 12 can be synthesized
by the solid phase procedure of Merrifield, by
semisynthesis, or by enzymatic or chemical combinations
of appropriately modified blocks of peptides.

The present invention also relates to nucleic acid
sequences, DNA and RNA, that encode the x-protein, and
expression vectors capable of expressing the
x-proteins. Preferably, the x-proteins of the subject
invention will be produced by inserting DNA encoding
the desired amino acid sequence of the x-protein into
an appropriate vector/host system where it is
expressed, or the gene may be used in a cell-free
rabbit reticulocyte ribosomal protein synthesis system.
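Deducing a DNA sequence that encodes a chosen amino acid
sequence can be sketched as a simple back-translation. The
Python fragment below is a minimal illustration, assuming
one fixed codon per residue from the standard genetic code;
the particular codon chosen for each residue, the stop codon,
and the function name back_translate are arbitrary choices
made here and are not taken from the Examples. A real design
would also weigh host codon usage and convenient restriction
sites.

```python
# One valid codon per amino acid from the standard genetic code.
# The specific choice among synonymous codons is an arbitrary assumption.
CODON_FOR = {
    "A": "GCT", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGT", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "TCT", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTT",
}
STOP_CODON = "TAA"


def back_translate(protein, add_stop=True):
    """Return one DNA sequence encoding `protein` (one-letter codes)."""
    try:
        dna = "".join(CODON_FOR[residue] for residue in protein.upper())
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]!r}") from None
    return dna + (STOP_CODON if add_stop else "")


# The (Ser 4 Gly) 3 Ser linker written out in one-letter code.
linker = "SSSSG" * 3 + "S"
print(linker)
print(back_translate(linker, add_stop=False))
```

Because the genetic code is degenerate, this is only one of
many DNA sequences that encode the same polypeptide.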

A variety of host/vector systems can be used to
express the polypeptides of this invention, such as the
one described in Example 4. Primarily, the vector
system must be compatible with the host cell used.
Host/vector systems include, but are not limited to,
the following: bacteria transformed with bacteriophage
DNA, plasmid DNA or cosmid DNA; microorganisms such as
yeast containing yeast vectors; mammalian cell systems
infected with virus (e.g., vaccinia virus, adenovirus,
etc.) or expressing plasmids; insect cell systems
infected with virus, such as baculovirus; or Xenopus
oocytes injected with DNA encoding the x-protein.
Other methods well known to one skilled in the art may
also be used.

Any of the standard recombinant methods for the
insertion of DNA into an expression vector can be used.
The recombinant vector can be introduced into the
appropriate host cells (bacteria, virus, yeast,
mammalian cells or the like) by transformation,
transduction or transfection, depending upon the
host/vector system, and cultured to express the
polypeptides of this invention.

Although strategies used to produce recombinant
proteins have often relied upon bacterial expression of
fusion proteins followed by in vitro refolding of the
protein, direct expression and secretion also may be
used to produce the x-proteins of the subject
invention. Expressed fusion proteins may require
purification and possible removal of the leader
sequence before refolding of the x-protein to recover
binding activity. However, this may be accomplished
using routine laboratory procedures. Direct expression
of the x-protein can potentially produce the protein
without the leader, but still in need of refolding,
whereas secretion can ideally produce native protein
for subsequent isolation. (Huston, J.S., et al.,
Methods Enzymol., 203:46-88 (1991)).

Alternatively, fusion proteins may be the end
product of expression. For example, a cytochrome b5
tail could be fused to the x-protein (at the gene
level), to make a membrane anchor that could hold the
x-protein in a membrane or Langmuir-Blodgett film. A
tail or leader sequence could also include a specific
recognition sequence for enzymatic or chemical
modification. Such an engineered, post-translational
modification could allow specific incorporation of
moieties such as biotin or phosphoinositol.

Possible refolding protocols include dilution
refolding, redox refolding and disulfide-restricted
refolding. (Huston, J.S., et al., Methods Enzymol.,
203:46-88 (1991), the teachings of which are hereby
incorporated by reference). Dilution refolding relies
on the observation that fully reduced and denatured
antibody fragments can refold on removal of denaturant
and reducing agent with recovery of specific binding
activity. (Haber, E., Proc. Natl. Acad. Sci. U.S.A.,
53:524 (1964)).

Redox refolding utilizes a glutathione redox
couple to catalyze disulfide interchange as the protein
refolds into its native state. (Saxena, P. and
Wetlaufer, D.B., Biochem., 9:5051 (1971); Huston,
J.S., et al., Methods Enzymol., 203:46-88 (1991)).

Disulfide-restricted refolding involves initial
formation of intrachain disulfides in the fully
denatured protein. This capitalizes on the favored
reversibility of antibody refolding when disulfides are
kept intact. (Buckley, C.E., et al., Proc. Natl. Acad.
Sci. U.S.A., 50:827 (1963); Huston, J.S., et al.,
Methods Enzymol., 203:46-88 (1991)). Disulfide
crosslinks should restrict the initial refolding
pathways available to the molecule. For chains with
the correct disulfide pairing, the recovery of a native
structure should be favored, while those chains with
incorrect disulfide pairs must necessarily produce
nonnative species on removal of denaturant.

Characterization of the multiple ligand binding
sites requires that both their affinity and specificity
be determined. Ideally, the measurement of binding
affinity should use a thermodynamically rigorous
approach such as equilibrium dialysis or
ultrafiltration. In the absence of such methods,
agreement between two distinct methods is desirable to
reinforce the veracity of the data. Ligand binding to
high affinity binding sites frequently involves very
fast rates of binding and slow rates of dissociation.
This characteristic makes measurement of their
association constants amenable to routine immunoassay
procedures, such as immunoprecipitation,
radioimmunoassay and ELISA. (Huston, J.S., et al.,
Methods Enzymol., 203:46-88 (1991)).

After evaluating the binding activity of a
particular model x-protein, it may be desirable to
further refine the initial x-protein to enhance its
biological activity. This refinement may be performed
computationally by additional computerized modeling or
it may be a "biological" refinement using recently
developed techniques such as the use of genetically
engineered bacteriophage to display and/or secrete the
x-protein and thereby select for an improved
conformation. (Marks, J.D., et al., J. Mol. Biol.,
222:581-597 (1991)).

Additional optimizing techniques, as described
above, may also be used to confer "special" properties
on the x-protein, such as "humanizing" the x-protein
using human FR-like regions as the scaffold protein to
reduce the possibility of adverse immunologic reactions
during therapy. (Daugherty, B.L., et al., Nucleic
Acids Res., 19:2471-2476 (1991)). Special properties
may also include increased solubility or stability or
improved renaturability or secretion.

This invention further relates to the method of
producing the x-protein which includes the steps of
determining the splice points for additional binding
sites, as described in Examples 2 and 3, determining
the amino acid sequence of the resulting construct,
deducing the DNA sequence encoding that amino acid
sequence, inserting the DNA sequence into an
appropriate expression vector and expressing the
protein in a suitable host system, refolding the
protein to its biologically active conformation and
analyzing its biological binding activity. For
example, in the case of a x-protein with catalytic
activity, this would include measurement of enzymatic
properties.

The ability to target therapeutic agents in a host
with antibodies has been a long-term goal of medical
research. The term host, as used hereinafter, is
intended to encompass mammalian hosts, including
humans. The most elegant targetable proteins would
consist of the minimum structures needed for selective
delivery and effector function. (Tai, M., et al.,
Biochem., 29:8024-8025 (1990)). Although the utility
of binding proteins having multiple binding sites of
different specificity has been widely recognized (U.S.
Patent 4,676,980), these crosslinked antibodies or
crosslinked antibody fragments suffer from the need for
complex purification schemes and large molecular size.
Notwithstanding the construction of the smaller-sized
sFv, the need still exists for a multifunctional
binding protein with as reduced a size and as compact a
shape as possible to permit effective therapeutic use.
The subject invention also relates to use of these
x-proteins in therapeutic and diagnostic procedures.
These uses include, but are not limited to, in vivo and
in vitro imaging agents, delivery agents for drugs,
radioisotopes, and cytotoxic substances. The x-protein
could also include a binding site for an effector
molecule, such as an enzyme, growth factor,
cell-differentiation factor, lymphokine, cytokine, hormone,
anti-metabolite, or an ion-sequestering sequence such
as calmodulin. The x-protein could further include a
binding site reactive with antibody-dependent cytolytic
cells or cytotoxic T cells. Also included are
x-proteins with catalytic or biosensor activity, as
well as binding sites that facilitate affinity
purification procedures.

In some instances, the primary binding site could
be a high affinity site that targets the protein to
specific cell surface locations, and the secondary site
could be designed to decrease normal receptor activity.
A x-protein could be designed with one binding site
comprising a receptor for a cell surface antigen and a
second binding site comprised of a modified receptor
with diminished affinity for its ligand. Thus the
x-protein would bind a cell via the normal binding
site, leaving the binding site with diminished binding
activity exposed, thereby resulting in decreased
receptor activity. For example, a x-protein could be
designed with neural cell adhesion molecule (NCAM)
variable regions that would modulate cell interactions
such as contact inhibition of malignant cells.

A x-protein may also be designed to modify,
enhance or inhibit cell-cell interactions or
communication. For example, one binding site could be
designed to target a cancer cell and a second binding
site could be designed to target an effector cell, such
as a macrophage, thus binding the macrophage in a
manner which results in destruction of the targeted
cell.

Moreover, a x-protein could mediate phenomena such
as antibody-dependent cellular toxicity. (Huston,
J.S., et al., Proc. Natl. Acad. Sci. USA, 85:5879-5883
(1988)). For example, a x-protein could be designed
with one binding site comprised of a receptor for a
cell surface marker protein and a second site derived
from a receptor for killer T cells. The x-protein
would bind to the target cell and the secondary site
would bind the cytotoxic T cell, resulting in
destruction of the target cell. The x-proteins of the
present invention would exhibit significantly greater
tissue accessibility or tumor penetration and faster
pharmacokinetics due to their compact size.

Recently, it has been shown that an Fv undergoes a
conformational change upon antigen binding. (Bhat, T.N.,
Nature, 347:483 (1990)). It is reasonable to
predict that the x-proteins of the present invention
would undergo similar conformational changes, due to
the significant sequence and structural homology with
the native Fv. Although this conformational change is
modest, it involves the translational movement of the
V H and V L domains relative to each other, such that the
orientation of the bottom loops can shift. This makes
the x-protein conformation sensitive to binding at
either the first or second binding site, and requires
that linkage exists between their binding equilibria.
It is reasonable to predict that this conformational
change would enable the x-protein to act as a
"molecular switch".

For example, the first site of the x-protein may
be a catalytic antibody combining site (e.g., a site
that catalyzes the conversion of a "pro-drug" to a
cytotoxic drug), while a second site is specific for
binding to a marker on a cell surface (e.g., a tumor
cell). If binding to the cell surface epitope induces
a conformational change in the x-protein such that the
loops of the catalytic site are brought into optimal
geometry for efficient catalysis, then the pro-drug
would be converted to cytotoxic drug directly at the
site of the tumor, so that high cytotoxic levels could
be maintained at the site while serum levels remained
low. The unbound catalyst would have low efficiency
and, because of its low molecular weight, would clear
rapidly from circulation.

The x-proteins are well suited to applications in
immunotargeting because of their binding capabilities
and compact size. The absence of constant domains will
reduce nonspecific background binding, thus enhancing
visualization of target tissue by in vivo or in vitro
imaging procedures. The use of x-proteins in in vitro
diagnostics would also reduce nonspecific binding,
thereby increasing the accuracy of these assays.

The x-proteins of the present invention are also
useful to deliver drugs, cytotoxic substances or
effector molecules to immunotargeted tissues. One of
the binding sites of the subject protein may exhibit
catalytic activity that is triggered by binding of a
specific ligand to the other binding site.

The invention will be further illustrated by the
following Examples, which are not intended to be
limiting in any way.

Example 1 

Constructing a Model of Bivalent McPC603 xsFv

The following illustration, which utilizes the
McPC603 Fv region (SEQ ID NOS: 1 and 10), for which
the three-dimensional crystallographic structure has
been determined, has been chosen in order to assess how
well a particular set of splice points will generate a
x-site (additional binding site) that recreates the
parent binding site. This example also reveals the
general method of building a x-protein model, by the
specific example of mapping the McPC603 V H CDRs to V H
bottom loop positions, and the V L CDRs to V L bottom
loop positions. Molecular modeling was performed on a
Silicon Graphics 4D/70GT Superworkstation using the
Biosym programs INSIGHT (to visualize models), HOMOLOGY
(to assemble spliced CDR/FR sequences), and DISCOVER
(to minimize the energy of the model) (Biosym, Inc.,
San Diego, CA). The HOMOLOGY program was used to
assemble the structure in five steps, and the DISCOVER
program was used to minimize the energy of the
resulting model about the splice points. Although
HOMOLOGY was used in this example, this program is not
necessary to build the model. Any one of several
alternative molecular mechanics programs, such as AMBER
and CHARMM, could be used instead of the DISCOVER
program.

Two copies of this McPC603 sFv structure were
superimposed (Figure 3A): model A had the binding site
up and model B had the bottom loops up, so that the CDR
loops of B superimposed on the lower loops of A, as
described above. The primed loops (e.g., H2') are those
spliced into the x-site.



1. Building the H2' loop and the bridge (Figures 4A
and 4B). The A coordinates for the V H domain were
used up to where H2' from structure B splices into
the C-terminus of structure A, and the A
coordinates of the linker and the following V L
domain were likewise used. The coordinates of the
H2' segment were taken from B. Once coordinates
had been assigned to the residues flanking the
bridge peptide region, the HOMOLOGY loop search
algorithm was used to find a 5 residue peptide
whose flanking 3 alpha carbons overlapped well
with the 3 amino acids on either side of the gap,
and whose conformation best fit the surface of the
V H domain. The HOMOLOGY program automatically
connects the peptide bonds at the splice points to
create the new single chain structure called MDL1.
It will generally be necessary to rotate side
groups in the interface between V H and H2' out of
the way in order to reduce steric conflicts. One
residue in H2' had to be changed because simple
rotation could not eliminate its steric conflicts:
Phe 137 (position H70) -> Ala. Using the DISCOVER
program, the atoms about the splice points were
moved so as to minimize contributions to the
energy of the structure due to improper bond
lengths and geometries. At this point the
composition of the bridge peptide was polyalanine;
additional computational analysis of the model may
be used to further assess the optimum length, path
and composition of the bridge. This new structure
(with H2' and the bridge) is called MDL1.

2. Building the H3' and H1' loops (Figures 4C and 4D).
Structure MDL1 supplied the coordinates for the
residues preceding and following the segment where
H3' was to be built into the model. The
coordinates for the H3' segment were taken from
structure B. Since H1' was a short segment of 3
residues that would replace exactly 3 residues of
MDL1, H1' was built by side chain substitution.
This structure is MDL2.

3. Building L3' (Figure 4E). Structure MDL2 supplied
the coordinates for the residues preceding and
following the segment where L3' was to be built
into the model. The segment for L3' was built
into the next stage of the model in a manner
analogous to the way that MDL2 was built. The
resulting structure is MDL3.

4. Building L1' (Figure 4F). Structure MDL3 supplied
the coordinates for the residues preceding and
following the segment where L1' was to be built
into the model. The segment for L1' was built
into the next stage of the model in a manner
analogous to the way that MDL2 was built. This
structure is MDL4.

5. Building L2' (Figure 4G). Structure MDL4 supplied
the coordinates for residues preceding the segment
where L2' was to be built into the C-terminus of
the model. The segment for L2' was built into the
next stage of the model in a manner similar to the
way that MDL1 was built, except that no connection
(bridge or linker) was made to the C-terminus of
the L2' segment. In order to stabilize the
C-terminus of the L2' loop, a disulfide bridge was
constructed between the C-terminus of L2' and the
segment that derived from the V L N-terminal strand
by substituting the Arg in L2' (at the position
corresponding to L68 in V L ) and the Thr at the
position corresponding to L5 in V L with Cys.

Refinement. At each stage, atoms making bad
contacts were rotated out of the way. This often
required breaking the peptide bond between
neighboring residues, rotating a backbone bond and
then reforming the peptide bond. The
functionality for these modeling operations is
built into the INSIGHT program and other similar
molecular modeling programs. The resulting
modified peptide bond was added to the list of
splice points. Strain in the splice point peptide
bonds was minimized by subjecting the residues on
either side of the spliced peptide bond to 100
cycles of steepest descent minimization using the
DISCOVER molecular mechanics program. The final
structure was minimized for 1000 steps of steepest
descent minimization without Coulombic
interactions, followed by 200 steps of steepest
descent and 1000 steps of conjugate gradient
minimization with Coulombic charge interactions.
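The idea behind the steepest descent cycles can be shown
with a toy calculation. The Python sketch below is not the
DISCOVER force field: it assumes a single harmonic
bond-stretch term, an illustrative force constant and step
size, and an ideal bond length of 1.33 angstroms (roughly
that of a peptide C-N bond). All names in it are
hypothetical. It relaxes one overstretched "splice" bond by
repeatedly moving the atoms a small step against the energy
gradient.

```python
import math


def bond_energy(coords, bonds, r0=1.33, k=100.0):
    """Sum of harmonic bond-stretch terms, 0.5 * k * (r - r0)^2."""
    return sum(
        0.5 * k * (math.dist(coords[i], coords[j]) - r0) ** 2 for i, j in bonds
    )


def gradient(coords, bonds, r0=1.33, k=100.0):
    """Analytic gradient of bond_energy with respect to every coordinate."""
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for i, j in bonds:
        r = math.dist(coords[i], coords[j])
        for axis in range(3):
            g = k * (r - r0) * (coords[i][axis] - coords[j][axis]) / r
            grad[i][axis] += g
            grad[j][axis] -= g
    return grad


def steepest_descent(coords, bonds, cycles=100, step=0.002):
    """Move every atom `step` times the negative gradient, `cycles` times."""
    coords = [list(p) for p in coords]
    for _ in range(cycles):
        for point, g in zip(coords, gradient(coords, bonds)):
            for axis in range(3):
                point[axis] -= step * g[axis]
    return coords


# Two atoms joined by a bond that starts overstretched at 2.0 angstroms.
start = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
bonds = [(0, 1)]
relaxed = steepest_descent(start, bonds, cycles=100)
print(bond_energy(start, bonds), bond_energy(relaxed, bonds))
print(math.dist(relaxed[0], relaxed[1]))  # approaches the ideal 1.33
```

A full molecular mechanics program adds angle, torsion, van
der Waals and, where enabled, Coulombic terms to the energy,
and typically switches to conjugate gradient minimization
once the worst strain has been removed.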
Figures 4A-4G show the final model of the McPC603
(McPC603) V H (V H ) - V L (V L ) x-protein with corresponding
pairs of loops highlighted with ribbons. The symmetry
between the top and the bottom of the molecule can be
seen. Each of the 5 pairs of spliced loops is
highlighted in a separate frame.

A detailed comparison of the conformation of the
new sites is presented in Figure 5. The upper view is
of a first combining site on the top of the molecule
while the lower view is the x-site built onto the
bottom of the molecule. In the figure, corresponding
regions of the loops are highlighted and side chain
heavy atoms are shown. It can be seen that during the
minimization process, there was some divergence in the
conformation of the backbone and of the side chains
between these two sites, but there still remains an
impressive degree of homology between them. This is a
measure of the congruency of the lower loop stems and
the corresponding upper loop stems. The observed
homology also suggests that the steric environment
surrounding the spliced loops must be similar to that
of the parent loops, even though it is clear that the
loops of the first site are more exposed than are those
of the second x-site.

Example 2

Defining top and bottom loop symmetry and choosing 
splice points. 

Because of the high level of symmetry that exists
between the V H and V L subunits, CDR loops can be
spliced in a "homologous" fashion (V H top loops onto V H
bottom loops, V H (V H ), and V L top loops onto V L bottom
loops, V L (V L )) or in a "heterologous" fashion (V L top
loops onto the bottom of V H , V H (V L ), and V H top loops
onto the bottom of V L , V L (V H )). In addition, there are
two ways to connect the V H and V L regions: 1) linker
peptide bridging from the C-terminus of V H to the
N-terminus of V L (V H - linker - V L ), or 2) linker peptide
running from the C-terminus of V L to the N-terminus of
V H (V L - linker - V H ). There are, therefore, four
possible types of x-protein derived from immunoglobulin
CDRs.
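The count of four follows from combining the two splicing
modes with the two domain orders, as the following Python
sketch enumerates. The labels and the function name
x_protein_types are hypothetical shorthand introduced for
this illustration.

```python
from itertools import product

# Loop placement: CDRs onto their own domain type, or onto the other one.
SPLICING_MODES = {
    "homologous": ("VH(VH)", "VL(VL)"),
    "heterologous": ("VH(VL)", "VL(VH)"),
}
# Direction in which the linker peptide joins the two domains.
DOMAIN_ORDERS = ("VH-linker-VL", "VL-linker-VH")


def x_protein_types():
    """Return every (splicing mode, domain order) combination."""
    return list(product(SPLICING_MODES, DOMAIN_ORDERS))


for mode, order in x_protein_types():
    domains = " + ".join(SPLICING_MODES[mode])
    print(f"{mode:12s} {order}  [{domains}]")
```

The model built in Example 1 corresponds to the homologous
mode with the V H - linker - V L order.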

If one is splicing the CDRs of one V H into the
bottom loops of another V H (as in Example 1) then H2
would be spliced into the V H C-terminal strand. The
C-terminal end of the new H2' segment would end up on the
outside of the V H region, near its N-terminal strand
(Figure 3). Thus, if one is splicing heavy chain loops
into the bottom of the heavy chain in a sFv connected
in the V H - V L order, the C-terminal end of H2' can be
connected to the peptide that links the heavy and light
chains. The L2 loop structure can likewise be spliced
onto the C-terminus of V L .

Modeling a V H (V H ) - V L (V L ) x-Protein based on McPC603:
The following discussion presents a detailed
analysis of how to construct a second McPC603 antigen
binding site on the bottom of McPC603 to form a
homo-domain V H -linker-V L x-protein. Although this
exemplification is based on McPC603, the structural
homology that exists between the framework regions of
known Fv structures clearly suggests that CDR splicing
to the bottom loops of Fv regions will work with any
antibody Fv. The existing structural homology between
known structures permits one to structurally align the
sequences of these known structures and to establish a
numbering index wherein the position numbers in the
strand regions identify unique structural locations.
The position scale used to refer to splice points is
defined in Figure 2B.

Determining loop splice points 

When the CDR loops are spliced into the lower
loops, it is important to preserve those framework
residues bordering both the top loops and the bottom
loops which are critical for maintaining the proper
folding of the β-barrel structure. In the following
figures, the spatial alignment of bottom loops with the
superimposed CDR loops is shown along with the segment
of aligned sequence that derives from the chosen splice
points. Three considerations govern the choice of the
splice points:

1. Preservation of critical framework residues.

2. Maximal incorporation of CDR sequences.

3. Fusion at closest possible splice points. In order
for the spliced loops to maintain their native
structure in the new site, it is important that
the chains be distorted as little as possible at
the splice sites. Therefore one tries to choose
splice points as close as possible to the
positions where the aligned chains are closest
together, e.g., where corresponding alpha carbons
are separated by no more than a few angstroms.


Determining the Splice Points for Splicing H3 (H-TL4)
into H-BL2, and L3 (L-TL4) into L-BL2:

Figures 6 and 7 show how the CDR3 regions (H3 and
L3) would be spliced into H-BL2 and L-BL2,
respectively. Critical framework residues in H-BL2 and
L-BL2 include the Gln at position H39 in H-BL2 and the
Gln at position L45 in L-BL2.

Referring to Figure 6, the residues at positions H102
to H114 from H3 (positions H101 to H115) can be spliced
into H-BL2. The H3 residue at position H101 is not
included because it would replace the critical
framework residue at position H39. The H3 residue at
position H115 is not included because splicing from
position H114 in the H3 loop to position H47 in the
framework leads to less distortion of both the
framework and the loop conformations.

The L3 loop (positions L97 to L104) can be spliced
into L-BL2 in a similar manner. Referring to Figure 7,
the new L3' loop includes residues from positions L99
to L104. The L3 residues at positions L97 to L98 are
not included because L98 would replace the critical
framework residue at position L45. The two chains are
sufficiently close together at L3 position L104 that
the entire C-terminal end of L3 can be spliced into the
new L3' loop. Consequently, only one residue from the
N-terminal end of L3 cannot be spliced. In the cases
of both the H3' and the L3' splices, the backbones of
the respectively aligned chains lie close enough
together at the indicated splice points that there is
little distortion of the backbone conformation in the
χ-protein.

Determining the Splice Points for Splicing H1 (H-TL1)
into H-BL4, and L1 (L-TL1) into L-BL4:

Figures 8 and 9 show how the CDR1 regions would be
spliced into the fourth bottom loops. Because sections
of both H-BL4 and L-BL4 that contain non-critical
residues do not align well with the CDR1 portion of
either CDR loop, only portions of the CDR1 loops can be
spliced.

The H1 segment (positions H31 to H35) is 5
residues long in McPC603, and of these, the 3 residues
(positions H31 to H33) that can be spliced are on the
outer portion of H-TL1, which faces the binding site.
Position H92 in H-BL4 is a critical residue that is
involved in a conserved intrachain salt bridge, while
positions H96 to H98 are conserved inner strand
residues. Consequently positions H31 to H33 are
spliced between positions H92 and H96 in H-BL4 to
create H1'.

In McPC603, L1 (positions L25 to L41) is fairly
long (17 residues), but only the internal 10 residues
(positions L30 to L39) can be spliced (Figure 9).
Position L30 is the first place where the L1 loop and
L-BL4 align. Framework positions L93 to L95 are
conserved inner strand residues that are structurally
analogous to H96 to H98. In order to preserve
framework position L93, the L1 loop is spliced at
position L39. Consequently, 10 residues from the
McPC603 L1 region can be spliced into L-BL4. In other
antibodies having shorter L1 regions, only about 3 or 4
residues can be spliced.
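
As a bookkeeping aid, the splices described so far can be tabulated and the operation of placing a segment between two preserved framework positions can be sketched in a few lines. The position ranges below are the ones stated in the text for McPC603; the function names and the toy residues are hypothetical and serve only to show the mechanics.

```python
# Spliceable CDR ranges stated in the text for McPC603 (first, last position).
SPLICED_RANGES = {
    "H3'": ("H", 102, 114),
    "L3'": ("L", 99, 104),
    "H1'": ("H", 31, 33),
    "L1'": ("L", 30, 39),
}

def segment_length(first, last):
    """Number of residues in an inclusive position range."""
    return last - first + 1

def splice_between(target, keep_left, keep_right, segment):
    """Replace everything strictly between two preserved framework
    positions with `segment`.

    target maps an integer position to a one-letter residue code.
    Positions <= keep_left and >= keep_right are kept unchanged.
    """
    ordered = sorted(target)
    left = "".join(target[p] for p in ordered if p <= keep_left)
    right = "".join(target[p] for p in ordered if p >= keep_right)
    return left + segment + right

# Toy framework stretch around H-BL4; the residues are placeholders,
# not the McPC603 sequence.
toy_target = {90: "A", 91: "E", 92: "D", 93: "T", 94: "A",
              95: "T", 96: "Y", 97: "Y", 98: "C"}
toy_h1_prime = splice_between(toy_target, 92, 96, "XYZ")
lengths = {name: segment_length(first, last)
           for name, (_, first, last) in SPLICED_RANGES.items()}
```

The computed lengths reproduce the counts given in the text where it states them (3 residues for H1', 10 residues for L1').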

Determining the Splice Points for Splicing H2 (H-TL2)
into the C-terminal end of VH:

Figure 10 shows how the H2 region of the inverted
copy of McPC603 aligns with the N-terminal and the
C-terminal ends of VH. Residue positions H48 to H50
that flank the N-terminal end of the H2 loop region
(positions H50 to H70) align closely with VH C-terminal
residue positions H117 to H119, respectively. When
splicing in the H2' loop, one wants to preserve as much
of the VH C-terminus as possible. Based on the
superposition of the C-terminal and H2' backbones,
splicing can be done between positions H115 and H119 in
the C-terminal strand and between positions H47 and H51
in the N-terminal flanking sequence of H2. Because several
residues in the C-terminal sequence of VH (and VL as
well) are probably important to stabilizing the packing
of BL1 and BL4, the splice should be made so as to
preserve as much of the C-terminus as possible. As
shown in Figure 10, therefore, the splice is made
between position H119 of the VH C-terminal strand and
position H51 of the H2' loop. The C-terminal end of
H2' (arrow A, Figure 10) is located on the surface of
VH, near the N-terminal strand of VH, when H2' is
properly positioned relative to H3', L3', H1' and L1'.
To connect this end to the N-terminal end of the
inter-chain linker used to create a single chain Fv
(arrow B, Figure 10), a 5 residue bridge peptide must be
added (between points B and A in Figure 10). In this
arrangement, the linker-bridge structure folds across
the bottom of the VH chain. The preferred bridge
peptide would contain Ser for solubility and non-polar
residues to help anchor it to the surface of VH so as
to stabilize the conformation of the H2' fold. In the
model and in the sequence shown in Figure 12, a
polyalanine bridge has been used.

Figure 11 shows how the L2 region of the inverted
copy of McPC603 aligns with the N-terminal and
C-terminal ends of VL. Residue positions L49 to L57 that
flank the L2 loop region (positions L57 to L63) align
closely with VL C-terminal residue positions L102 to
L109. The same considerations are involved in making
the L2' splice as with the H2' splice. The splice is
made from C-terminal position L108 to L2 position L56.
The L2 loop structure folds around until it runs
parallel to the VL N-terminal strand. In order to
stabilize the L2' loop structure, we include a
disulfide bond from the end of the L2' loop to a
residue in the N-terminal strand of VL. To accomplish
this, the Arg at position L68 in the C-terminal end of
the L2 loop and the Thr at position L5 in the
N-terminal strand of VL were changed to Cys.
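
The two Cys substitutions can be expressed as a checked point mutation on a position-indexed sequence. This is only a sketch: the helper name is hypothetical and the four-residue excerpt is a toy, although its residues at L5 and L68 match the Thr and Arg named in the text.

```python
def substitute(seq_by_pos, position, expected, new):
    """Return a copy of seq_by_pos with `position` changed to `new`.

    Raises ValueError if the residue currently at `position` is not
    `expected`, which guards against applying the change to a
    misnumbered sequence.
    """
    found = seq_by_pos.get(position)
    if found != expected:
        raise ValueError(
            f"{position}: expected {expected!r}, found {found!r}")
    mutated = dict(seq_by_pos)
    mutated[position] = new
    return mutated

# Toy excerpt keyed by structural position label (not a full VL chain).
toy_vl = {"L5": "T", "L6": "Q", "L68": "R", "L69": "F"}
with_cys = substitute(toy_vl, "L68", "R", "C")   # Arg L68 -> Cys
with_cys = substitute(with_cys, "L5", "T", "C")  # Thr L5  -> Cys
```

Checking the expected residue before substituting is a design choice of this sketch; it makes a numbering mismatch fail loudly instead of silently altering the wrong residue.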

Figure 12 shows the sequence (SEQ ID NO: 17) of
the resulting construct with the sequences of the
primary CDR sites (bold) and secondary CDR sites
(underlined) identified.

Example 3

Determining χ-Site Splice Points From Primary Sequence
Alignment

The members of the Ig Superfamily share a high
degree of structural homology. By using the homology
that exists between members of known structure, it is
possible to engineer χ-proteins from Ig Superfamily
molecules whose sequence is known but whose tertiary
structure is not yet determined by crystallographic or
NMR methods.

Figure 13A (SEQ ID NOS: 10-16 and 18-30) and 13B
(SEQ ID NOS: 1-9 and 31-40) show the set of sequences
of Fv domains for which structures are known and are
available from the Brookhaven Protein Databank (PDB at
Brookhaven National Laboratory).

Table 1 lists the sequence and structure
identification names, along with the Kabat et al.
classification and references for the sequences that
appear in Figures 2B (SEQ ID NOS: 1-16) and 13A (SEQ ID
NOS: 10-16 and 18-30) and 13B (SEQ ID NOS: 1-9 and
31-40). The alignment of these sequences is based on the
structural homology that exists between them,
especially in the β-strand regions (OS and IS).
Because of this high degree of structural correlation,
one can assign positions in a "consensus 3-D model" to
residues in a set of aligned β-barrel sequences.
Furthermore, at certain positions, there exists a
strong preference for certain residues or for residues
having certain physical characteristics (Table 2).
Thus, while the 3-D structures of most Fv sequences
remain undetermined experimentally, the residue
preferences at these positions make it possible to
reliably align these sequences with those V or Ig
Superfamily sequences for which the 3-D structures have
been experimentally determined. Ig V regions can
thereby be aligned with the sequences of more distant
members of the immunoglobulin superfamily, such as the
T-cell receptor, which was successfully modeled
according to the McPC603 Fv structure (Novotny, J. et
al., Proc. Natl. Acad. Sci. USA 88:8646-8650 (1991)).

Once a sequence has been positioned in relation to
the set of structurally aligned V region sequences, one
can use the splice point analysis developed in Example
2 to construct a model and delineate a sequence for a
χ-protein by analogy. The procedure for determining
proper splice points is as follows:



1. Sequences of undetermined 3-D structure are
lined up with the set of structurally aligned
sequences. The sequence alignment process makes
use of the conserved and semi-conserved residue
positions listed in Table 1 to fix the alignment
in certain regions. Between these anchor points
it may be necessary to introduce gaps. Sequences
in the β-strand regions (OS and IS) generally line
up without the need for introducing gaps except in
the N-terminal section of the light chain variable
region (L-OS1 to L-BL1). Relative to all other
light chain V region classes, kappa light chain
class V sequences have a deletion at position L7
and an insertion at position L17. Lambda light
chain sequences also have a deletion at position
L7, but no insertion at position L17. Kappa light
chain class VII sequences have a deletion at
position L23. Other than these gaps in the
N-terminal section of the light chains, the
alignment of sequences may make use of
conserved positions to establish anchor points.
Further help in the alignment of intervening
sequences is derived from conservation of
side-chain properties (i.e., polar, non-polar,
charged, acid, basic, etc.) at certain positions.
The other major gaps will exist in the CDR loop
regions (H1, L1, H2, L2, H3 and L3). Because CDR
loops are variable in sequence and length, the
exact location of gaps in these regions is less
critical, as the sequences in these regions show
little homology. Structural variation tends to be
the largest at the center of CDR loops of similar
sequence, so gaps are preferably positioned in the
center of CDR top loops.

2. Once the sequences of one or more V regions
of undetermined structure have been lined up with
the structurally aligned reference set, structure
positions are assignable to the residues that
locate in the regions of the inner and outer
β-strands. Because the χ-site splice points occur
near the ends of the β-strand regions, they can be
assigned by analogy with the splice positions
determined for McPC603 in Example 2. Given two
sequences aligned with the reference set, the
primary sequence is that into whose bottom loops
the χ-site will be built, and is referred to below
as the target sequence; while the secondary
sequence contains the CDR loops which will be
built into the χ-site, and which will be referred
to below as the source sequence. χ-site loops are
created by splicing top loop segments from the
source sequence in place of bottom loop segments
in the target sequence. For a VH(VH)·VL(VL)
χ-protein construct similar to that designed in
Example 2, the spliceable portions of source CDRs,
flanked by the fixed top loop β-strand end points,
are labelled in Figure 13 as L1'(S), L2'(S),
L3'(S), H1'(S) and H3'(S), while the target bottom
loop segments, flanked by the fixed bottom loop
β-strand end points, are labelled in Figure 13 as
L1'(T), L2'(T), L3'(T), H1'(T), H2'(T) and H3'(T).
Thus, to create a χ-site L3' loop in the L-BL2
loop of the target sequence, one would splice the
residues identified as L3'(S) in Figure 13 to the
residues flanking the segment identified as
L3'(T); similarly, to create a χ-site H1' loop in
the H-BL4 loop of the target sequence, one would
splice the residues identified as H1'(S) in Figure
13 to the residues flanking the segment identified
as H1'(T), and so forth. These splices are done
to create a VH(VH)·VL(VL) χ-protein as in Example
2. To create a χ-site L3' loop in the H-BL2 loop
of the target sequence of a VH(VL)·VL(VH)
χ-protein, one would splice the residues in the
segment identified as L3'(S) in Figure 13 to the
residues flanking the segment identified as H3'(T).
Likewise, to create a χ-site H1' loop in the L-BL4
loop of the target sequence, one would splice the
residues flanking the region identified as H1'(S)
in Figure 13 to the residues flanking the segment
identified as L1'(T), and so forth.

3. Alternatively, one can build structural
models for the sequences of undetermined structure
using the coordinates of one or more of the known
structures as a basis for the framework residue
locations. Programs such as the Biosym HOMOLOGY
(Biosym, Inc., San Diego, CA) program are designed
to aid in this process. The procedure is well
established in the literature (Greer, J., Meth. in
Enzymol. 202:239-252 (1991)). Once
constructed, models of the parent Fv, sFv or V
region and of the Fv regions of the corresponding
χ-site can thus be used to determine splice points
and to construct a model of the χ-protein in the
manner set forth in Examples 1 and 2. Building a
model provides insight into possible steric
conflicts, particularly, as was noted in Example
2, at the interface between H2' and the surface
residues of the Fv domain.
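
Step 2 of the procedure amounts to a labelled substitution: each target segment (T) is replaced by the source segment (S) assigned to it, and the assignment differs between the two construct types. The sketch below assumes a representation that is not in the text, namely that each target segment is given as a half-open index span into the target chain. The pairings shown are only those the text states explicitly; the remaining ones ("and so forth") are not filled in, and the toy strings are placeholders, not sequences from Figure 13.

```python
# Target segment -> source segment, as stated explicitly in step 2.
PAIRINGS = {
    "VH(VH).VL(VL)": {"L3'(T)": "L3'(S)", "H1'(T)": "H1'(S)"},
    "VH(VL).VL(VH)": {"H3'(T)": "L3'(S)", "L1'(T)": "H1'(S)"},
}

def splice_segments(target_seq, target_spans, source_segments, pairing):
    """Replace each paired target span with its source segment.

    target_spans maps a target label to (start, end) half-open indices
    into target_seq. Spans are processed right to left so that earlier
    indices stay valid after each replacement. Unpaired spans are left
    untouched.
    """
    result = target_seq
    ordered = sorted(target_spans.items(),
                     key=lambda item: item[1][0], reverse=True)
    for target_label, (start, end) in ordered:
        source_label = pairing.get(target_label)
        if source_label is None:
            continue
        result = result[:start] + source_segments[source_label] + result[end:]
    return result

# Placeholder chain: upper case stands for framework, lower case for
# the bottom-loop segments to be replaced.
toy_chain = "AAAbbbCCCdddEEE"
toy_spans = {"H1'(T)": (3, 6), "L3'(T)": (9, 12)}
toy_source = {"H1'(S)": "XX", "L3'(S)": "YYYY"}
toy_result = splice_segments(toy_chain, toy_spans, toy_source,
                             PAIRINGS["VH(VH).VL(VL)"])
```

In practice the heavy and light chains would be handled as separate target sequences; they are combined in one toy string here only to keep the example short.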

Figure 14 shows the sequence of χ-protein R19.9
(D1.3) (SEQ ID NO: 42) as it would be constructed
following parts 1 and 2 of the example using the
aligned sequences in Figure 14. The framework and
primary CDR loops are taken from R19.9 while the
sequences for the χ-site loops were taken from the
H1'(S), H2'(S), H3'(S), L1'(S), and L3'(S) regions of
D1.3.

Figure 14 also shows the sequence of χ-protein
MCP(MCP) (McPC603(McPC603) (SEQ ID NO: 41)) as it would
be constructed in Examples 1 and 2. Also shown are
sequences of χ-proteins 26-10(D1.3) (SEQ ID NO: 43) and
26-10(GLOOP4) (SEQ ID NO: 44) as they would be
constructed using the method of Example 3.
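
Step 1 of the procedure recommends placing alignment gaps at the center of CDR top loops. A minimal sketch of that rule follows; the function name is hypothetical, and placing the extra residue of an odd-length loop on the N-terminal side of the gap is an arbitrary choice of this sketch, not something the text specifies.

```python
def pad_loop_center(loop, width, gap="-"):
    """Pad a CDR loop to `width` columns by inserting the gap
    characters at the center of the loop.

    Raises ValueError if the loop is longer than the requested width.
    """
    if len(loop) > width:
        raise ValueError("loop is longer than the alignment width")
    missing = width - len(loop)
    middle = (len(loop) + 1) // 2  # odd lengths keep the extra residue on the left
    return loop[:middle] + gap * missing + loop[middle:]

padded_even = pad_loop_center("GYGY", 6)
padded_odd = pad_loop_center("ABCDE", 7)
```

Applied to every loop in a set, this brings loops of different lengths to a common column count while keeping their ends, which adjoin the β-strands, in register.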



TABLE 2

Light Chain   Residue            Heavy Chain   Residue
Position      Preference         Position      Preference

L1            D,(E)              H1            E,(D,Q)
L2            I                  H2            V
                                 H3            Q,K,H
L4            M,L                H4            L
L5            T
L6            Q                  H6            E,(Q)
L7            S,T,(-)            H7            S,(T,P,-)
L8            P,(T,E)
                                 H11           L,(V)
                                 H12           V,(M)
                                 H14           P,(A)
L16           G,(L)              H15           G,S
L17           -,G
L20           V,(A)              H18           L,V,(I)
L21           T,(S)              H19           K,R,S
L22           I,(M,L)            H20           L,I,M
L23           S,T,(-)            H21           S,T
L24           C                  H22           C
L27           S,(T)
                                 H25           S,T
                                 H26           G
                                 H27           Y,F,(D,T,-)
                                 H28           polar
                                 H29           F,I,L
                                 H30           T,S,(D)
L40           L,M,V,(A)
L42           W                  H36           W
L43           Y,(V,I)            H37           V,(I,M)
              (illegible)        H38           K,R
                                 H39           Q,(K)
L46           K
                                 H41           P,(H)
                                 H42           G,(E)
                                 H43           K,Q,N,R
L51           P,(V,F)            H45           L
L52           K,(R,Q)            H46           E,(D)
                                 H47           W,(Y,H,D)
                                 H48           I,(M,L)
L54           L,(W)
L55           I,(V)              H49           G,(A)
L56           Y,(G,K)
                                 H51           I,(V,S)
L59           S
                                 H62           Y,(T)
L64           G
L65           V,(I)              H66           F,(L,V)
L66           P,(S)              H67           K,(Q,M)
L68           R                  H69           K,R,(L)
L69           F
L70           S,(T)
L71           G,(A,V)            H72           F,I,L,V
L72           S                  H73           S,T
L74           S                  H75           D,(N)
L75           G
L76           T                  H77           S,(T,A,P)
                                 H78           polar
                                 H79           S,N
                                 H80           (-)
                                 H81           A,L,V
                                 H82           Y,(F,H)
L80           L,(F)              H83           L,(M)
                                 H84           Q,(D,E,K)
L82           I                  H85           L,M,I
L83           polar              H86           S,N,(D,R)
L85           V,M,L,(A)          H88           L,(V)
L86           Q,E
L88           E                  H91           E,(D,A)
L89           D                  H92           D
                                 H93           T,S
L91           A,(G)              H94           A,(G)
L93           (illegible),(H)    H96           Y
L94           Y,F                H97           Y,F
L95           C                  H98           C
L96           Q,(A,S)            H99           A
                                 H114          D,A
(illegible)   (illegible)
                                 H116          W
L108          G                  H117          G
                                 H118          Q,A
L110          G                  H119          G
L111          T                  H120          T
L112          K
L113          L,(V)              H122          V,(L)
L114          E,(T,Q,K)          H123          T
L115          I,(L,V)            H124          V
L116          K                  H125          S

Notes:

1) Single letter amino acid codes; letters in
parentheses signify less common preferences.

2) "-" signifies the occurrence of a gap.

3) Each row corresponds to an equivalent
location in the VH and VL chains.
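
The residue preferences in Table 2 lend themselves to a simple lookup when checking whether a candidate alignment places conserved residues at their expected positions. The sketch below encodes only a handful of rows from the table as an illustration; the data layout, the function names and the three category labels are assumptions of this sketch, not part of the disclosure.

```python
# A few rows of Table 2: position -> (preferred, less common).
# Letters in parentheses in the table are stored as "less common".
PREFERENCES = {
    "L1": ("D", "E"),
    "L24": ("C", ""),
    "L42": ("W", ""),
    "L95": ("C", ""),
    "H1": ("E", "DQ"),
    "H22": ("C", ""),
    "H36": ("W", ""),
    "H98": ("C", ""),
}

def classify(position, residue):
    """Classify a residue at a tabulated position as 'preferred',
    'less common' or 'unlisted'."""
    preferred, less_common = PREFERENCES[position]
    if residue in preferred:
        return "preferred"
    if residue in less_common:
        return "less common"
    return "unlisted"

def anchor_score(assignment):
    """Count tabulated positions whose assigned residue is listed
    (preferred or less common). Positions absent from PREFERENCES
    are ignored."""
    return sum(
        1
        for position, residue in assignment.items()
        if position in PREFERENCES
        and classify(position, residue) != "unlisted"
    )

toy_assignment = {"L24": "C", "L42": "W", "L95": "A", "L50": "G"}
toy_score = anchor_score(toy_assignment)
```

A higher score for one candidate alignment than for another suggests that more anchor positions are occupied by tabulated residues; it is a coarse check and does not replace inspection of the alignment.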

EXAMPLE 4

Construction, Expression and Evaluation of a χ-Protein

The χ-1 and χ-2 genes (Figure 15A and B, SEQ ID NOS:
45 and 47) were prepared by mutagenesis of the 26-10 sFv
gene as described in (Huston, J.S., et al., Proc. Natl.
Acad. Sci. USA, 85:5879-5883 (1988); Tai, M.-S., et al.,
Biochemistry 29:8024-8030 (1990)) and were incorporated
into the pET vector described in Studier, F.W., and Moffatt,
B.A., J. Mol. Biol. 189:113 (1986) behind a T7 promoter.
Upon transformation of E. coli with this vector, direct
expression produced each in the form of cytoplasmic
inclusion bodies. Cells were treated with lysozyme to
allow cell lysis and ultracentrifugal isolation of the
inclusion bodies, which were then dissolved in 6 M
guanidinium chloride containing 10 mM dithiothreitol, 25 mM
Tris, and 10 mM EDTA at pH 8.1; the solution was incubated
overnight at room temperature. The protein was then
diluted into 3 M or 3.5 M urea buffer containing 25 mM
Tris, 10 mM EDTA, and a glutathione redox couple (1 mM
oxidized, 0.1 mM reduced) at pH 8.1. Following 18 h at
4°C, each was fully renatured by dialysis into phosphate
buffered saline (PBS), consisting of 0.05 M potassium
phosphate, 0.15 M NaCl, pH 7.0, and 0.03% NaN3. The χ-1
and χ-2 solutions were then passed through
ouabain-Sepharose columns, washed first with PBSA (PBS +
0.03% NaN3), then with 1 M NaCl in PBSA to remove any
unbound material from the columns, and elution effected by
displacing specifically bound protein with 20 mM ouabain in
PBSA. The affinity purified χ-1 and χ-2 proteins were then
examined by SDS polyacrylamide gel electrophoresis. Figure
16A and B shows the χ-1 (SEQ ID NO: 46) and χ-2 (SEQ ID NO:
48) polypeptide chains following their affinity isolation.
In Figure 16B, the oxidized χ-2 (upper band) is compared
with 26-10 sFv (lower band) in lanes denoted as mixture.
Sequence analysis of the χ-2 protein verified that the
insertions noted in the gene sequence (Figure 15B) were
present in the protein sequence. CNBr fragments of the
protein were made and sequenced as a mixture in an Applied
Biosystems 470A gas-phase sequencer equipped with a model
120A on-line analyzer. The gel (Figure 16B) is also
consistent with the increased molecular weight of the χ-2
over the 26-10 sFv. Refolded protein that bound to the
ouabain-Sepharose column has necessarily regained active
antibody combining sites for digoxin-like cardiac
glycosides typified by ouabain.

Additional insights into the shape and properties of
the χ-1 and χ-2 proteins are apparent from Superdex 75 size
exclusion chromatography of the affinity-purified
χ-proteins in comparison to the 26-10 sFv, as shown in
Figure 17. The single H1' insertion of χ-1 adds two tyrosyl
residues within the GYGY sequence, resulting in a
pronounced tendency to dimerize and a very skewed profile
indicative of dissociation into monomer (data not shown).
In contrast, data shown in Figure 17 for χ-2 (top panel)
indicate that it appears perfectly behaved in solution,
devoid of any apparent dimer. Thus, as one incorporates a
multiplicity of χ-CDRs in V regions, the known
hydrophobicity of CDR sequences apparently is contained by
the aggregate of χ-CDR conformation and interaction. The
added H1' and H2' loops necessarily increase the protein's
Stokes radius, resulting in its elution position being
between the 26-10 monomer and dimer positions (bottom
panel). A mixing experiment that combines both proteins in
a single chromatographic separation (middle panel)
indicates that the χ-2 shows no apparent interaction with
the 26-10 monomer and dimer species, as the middle profile
appears to be a simple additive composite of the top and
bottom chromatograms. The peaks beyond 30 minutes
(horizontal axis is in minutes) are simply injection
artifacts on the HPLC system, probably due to buffer
differences between the column and sample.

Equivalents

Those skilled in the art will recognize, or be able to
ascertain using no more than routine experimentation, many
equivalents to the specific embodiments of the invention
described herein. Such equivalents are intended to be
encompassed by the following claims.

CLAIMS

The invention claimed is:

1. A chimeric multivalent immunoglobulin (Ig) Superfamily
protein analogue comprising one or more polypeptide
chains forming a β-barrel domain containing
complementarity-determining region-like (CDR-like)
regions and framework region-like (FR-like) regions,
said CDR-like regions defining a ligand binding site
and said protein analogue having at least one
additional ligand binding site segment spliced into
the FR-like regions of said β-barrel domain.

2. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein said polypeptide chains have an
amino acid sequence wherein said sequence is
substituted or modified in the amino acid sequence of
at least one amino acid residue.

3. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein a non-covalently associated two
chain polypeptide forms a β-barrel domain.

4. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein a single chain polypeptide forms a
β-barrel domain.

5. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 comprising a single chain polypeptide
forming a β-barrel domain wherein said single chain
polypeptide is comprised of two polypeptide chains
connected by a polypeptide linker spanning the
distance between the C-terminus of one chain to the
N-terminus of the other chain.

6. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein the polypeptide chain is selected
from the group consisting of: heavy chain (H), light
chain (L), α chain (α), β chain (β), γ chain (γ), δ
chain (δ), or ε chain (ε).

7. Biological material having a nucleotide sequence which
encodes a chimeric multivalent Ig Superfamily protein
analogue of Claim 1.

8. A replicable recombinant DNA expression vector
containing the nucleotide sequence of Claim 7.

9. A chimeric multivalent antibody analogue comprising
one or more polypeptide chains forming a β-barrel
domain containing complementarity determining regions
(CDRs) and framework regions (FRs), said CDRs defining
an antigen binding site, said antibody analogue having
at least one additional antigen binding site segment
spliced into FRs of said β-barrel domain.

10. A chimeric multivalent antibody analogue of Claim 9
wherein a non-covalently associated two chain
polypeptide forms a β-barrel domain.

11. A chimeric multivalent antibody analogue of Claim 9
wherein a single chain polypeptide forms a β-barrel
domain.

12. A chimeric multivalent antibody analogue of Claim 9
wherein said CDRs and FRs are comprised of heavy chain
(H) polypeptide chains and light chain (L) polypeptide
chains derived from variable regions (V) of
immunoglobulin proteins.




13. A chimeric multivalent antibody analogue of Claim 12
wherein the CDRs are spliced into FRs of the β-barrel
domain to form an additional binding site segment such
that a variable heavy chain (VH) CDR is spliced into a
VH FR to form a VH(VH) polypeptide chain.

14. A chimeric multivalent antibody analogue of Claim 12
wherein the CDRs are spliced into FRs of the β-barrel
domain to form an additional binding site segment such
that a variable light chain (VL) CDR is spliced into a
VL FR to form a VL(VL) polypeptide chain.

15. A chimeric multivalent antibody analogue of Claim 12
wherein the CDRs are spliced into the FRs of the
β-barrel domain to form an additional binding site
segment such that a VH CDR is spliced into a VL FR to
form a VL(VH) polypeptide chain.

16. A chimeric multivalent antibody analogue of Claim 12
wherein the CDRs are spliced into the FRs of the
β-barrel domain to form an additional binding site
segment such that a VL CDR is spliced into a VH FR to
form a VH(VL) polypeptide chain.

17. A chimeric multivalent antibody analogue of Claim 9
comprising a single chain polypeptide forming a
β-barrel domain wherein said single chain polypeptide is
comprised of two polypeptide chains connected by a
polypeptide linker spanning the distance between the
C-terminus of one chain to the N-terminus of the other
chain.

18. A chimeric multivalent antibody analogue of Claim 17
wherein said two polypeptide chains connected by a
linker further comprise two VH(VH), VL(VL), VH(VL) or
VL(VH) polypeptide chains.

19. A chimeric multivalent antibody analogue of Claim 18
wherein to the N-terminal end of the polypeptide
linker spanning the distance between the C-terminus of
one polypeptide chain to the N-terminus of the other
polypeptide chain is added a polypeptide residue
bridge which connects the N-terminal end of the linker
to the C-terminal end of a CDR sequence which has been
added to the C-terminal end of a FR sequence.

20. The polypeptide linker and bridge of Claim 19
comprising at least 19 amino acid residues.

21. A chimeric multivalent antibody analogue of Claim 9
wherein said CDRs and FRs are of mammalian origin.

22. A chimeric multivalent antibody analogue of Claim 9
wherein said CDRs and FRs are of mouse myeloma origin.

23. Biological material having a DNA sequence which
encodes the chimeric binding protein of Claim 9.

24. A replicable recombinant DNA expression vector
containing the DNA sequence of Claim 23.

25. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein, upon the binding of a ligand to
one binding site, a conformational change is initiated
in the β-barrel domain such that the affinity of the
second binding site is modified.

26. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein an additional polypeptide effector
molecule having a biological activity is linked to the
N- or C-terminus of said β-barrel domain, said
biological activity independent of the ligand binding
activity of the chimeric multivalent protein analogue.

27. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein one binding site is reactive with a
diagnostic imaging agent.

28. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein one binding site is reactive with a
radioisotope.

29. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein one binding site is reactive with a
cytotoxic substance.

30. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein one binding site is reactive with
an effector molecule.

31. A chimeric multivalent Ig Superfamily protein analogue
of Claim 1 wherein one binding site is reactive with a
marker on a cytotoxic cell.

32. A method for imaging specific tissue in a host
comprising:

a) administering to a host a chimeric multivalent Ig
Superfamily protein analogue of Claim 1 having
one binding site reactive with a targeted tissue
specific antigen and a second binding site
reactive with a diagnostic imaging agent under
conditions wherein said protein analogue binds to
the targeted tissue; and
b) administering the imaging agent to the host under
conditions whereby said imaging agent binds to
the chimeric multivalent Ig Superfamily protein
analogue resulting in a detectable image of the
targeted tissue.

33. A pharmaceutical composition for administration to a
host for imaging specific tissue in a host comprising
the chimeric multivalent Ig Superfamily protein
analogue of Claim 32 in a pharmaceutically acceptable
carrier.

34. A method of irradiating specific tissue in a host
comprising:

a) administering to a host a chimeric multivalent Ig
Superfamily protein analogue of Claim 1 having
one binding site reactive with a targeted tissue
specific antigen and a second binding site
reactive with a radioisotope under conditions
whereby said protein analogue binds to the
targeted tissue; and
b) administering the radioisotope to the host under
conditions whereby said radioisotope binds to the
chimeric multivalent Ig Superfamily protein
analogue wherein binding of said chimeric protein
analogue to targeted tissue and binding of said
radioisotope to said chimeric protein analogue
results in irradiation of the targeted tissue.

35. A pharmaceutical composition for administration to a
host for irradiating specific tissue in a host
comprising the chimeric multivalent Ig Superfamily
protein analogue of Claim 34 in a pharmaceutically
acceptable carrier.

36. A method of delivering a cytotoxic substance to 
specific tissue in a host comprising: 

a) administering to a host a chimeric multivalent Ig 
Superfamily protein analogue of Claim 1 having 

0 one binding site reactive with a targeted tissue 

specific antigen and a second binding site 
reactive with a cytotoxic substance under 
conditions whereby said protein analogue binds to 
the targeted tissue; and 

5 b) administering the cytotoxic substance to the host 

under conditions whereby said cytotoxic substance 
binds to the chimeric multivalent Ig Superfamily 
protein analogue, wherein binding of said 
chimeric protein analogue and binding of said 

0 cytotoxic substance to said chimeric protein 

analogue results in delivering the toxic 
substance to the targeted tissue. 



37. A pharmaceutical composition for administration to a 
host to deliver a cytotoxic substance to specific 
tissue in a host comprising the chimeric multivalent 
Ig Superfamily protein analogue of Claim 36 in a 
pharmaceutically acceptable carrier. 

38. A method of lysing target cells in a host having
cytotoxic cells comprising administering to a host a
chimeric multivalent Ig Superfamily protein analogue
of Claim 1 having one binding site reactive with a
surface receptor of a cell targeted to be lysed, and a
second binding site reactive with a marker on a
cytotoxic cell under conditions whereby said protein
analogue binds to the targeted cell and said cytotoxic
cell binds to the chimeric multivalent Ig Superfamily
protein analogue, wherein binding of said chimeric
protein analogue and binding of said cytotoxic cell
results in lysis of the targeted cell.



39. A pharmaceutical composition for administration to a
host a cytotoxic cell which lyses a target cell
comprising the chimeric multivalent Ig Superfamily
protein analogue of Claim 38 contained in a
pharmaceutically acceptable carrier.

40. A method of modifying the function of a cell surface
receptor of specific tissue in a host comprising
administering to a host a chimeric multivalent Ig
Superfamily protein analogue of Claim 1 having one
binding site reactive with a targeted cell surface
receptor and a second binding site reactive with an
effector molecule under conditions whereby said
protein analogue binds to the targeted tissue and said
effector molecule binds to the chimeric multivalent Ig
Superfamily protein analogue, wherein binding of said
chimeric protein analogue and binding of said effector
molecule results in selective modification of the
function of the targeted cell surface receptor.

41. A pharmaceutical composition for administration to a
host an effector molecule which modifies the function
of a cell surface receptor of a specific tissue
comprising the chimeric multivalent Ig Superfamily
protein analogue of Claim 40 contained in a
pharmaceutically acceptable carrier.

42. The chimeric multivalent Ig Superfamily protein
analogue of Claim 1 having one binding site reactive
with a preselected ligand and a second binding site
reactive with a substance labeled with a radioisotope
or enzyme suitable for use as a quantifying agent in
an in vitro diagnostic assay.

43. The chimeric multivalent Ig Superfamily protein
analogue of Claim 1 having one binding site reactive
with a preselected ligand and a second binding site
having catalytic activity.

44. The chimeric multivalent Ig Superfamily protein 
analogue of Claim 43 wherein both binding sites have 
catalytic activity. 

45. The chimeric multivalent Ig Superfamily protein
analogue of Claims 43 and 44 contained in a 
physiologically compatible carrier solution for 
administration to vertebrates. 

46. The chimeric multivalent Ig Superfamily protein 
analogue of Claim 1 crosslinked at sites other than
ligand binding sites to form a two dimensional array
of chimeric multivalent protein analogues.

47. A method for producing a chimeric multivalent Ig
Superfamily protein analogue comprising the steps of:

a) determining the splice points for CDR-like
regions to form additional ligand binding site
segments on the FR-like regions of a β-barrel
domain whereby insertion of CDR-like region amino
acid residues into the FR-like region residues
maintains the folded structure required for
binding activity with a preselected ligand;

b) determining the amino acid sequence of the
resulting construct having a first ligand binding
site and a second ligand binding site;

c) deducing the DNA sequence encoding the amino acid
sequence of b);

d) synthesizing the DNA sequence;

e) inserting the DNA sequence into an appropriate
expression vector and expressing the polypeptide
in a suitable host system;

f) isolating and purifying the expressed
polypeptide; and

g) refolding the purified polypeptide to its
immunologically reactive conformation, thereby
resulting in a chimeric multivalent Ig
Superfamily protein analogue.

48. The method of Claim 47 wherein determining the splice
points is accomplished computationally by use of a 
computer-generated three dimensional structure of the 
chimeric multivalent Ig Superfamily protein analogue. 



49. The method of Claim 47 wherein determining the splice
points is accomplished by primary sequence alignment.

50. A method for effecting cell-cell interactions
comprising administering to a host a chimeric
multivalent Ig Superfamily protein analogue of Claim 1
having one binding site reactive with a targeted cell
surface receptor of a first cell and a second binding
site reactive with a targeted cell surface receptor of
a second cell under conditions whereby said protein
analogue binds to said first cell and second cell
wherein binding of said first cell and second cell to
the chimeric protein analogue results in interaction
between the two cells.

51. A molecular switch comprising a chimeric multivalent 
Ig Superfamily protein analogue of Claim 1 having one 
binding site initiating a conformational change in
said chimeric protein analogue when said binding site
is bound to ligand, whereby the conformational change
causes the chimeric protein analogue to act as a
molecular switch.

52. A chimeric multivalent Ig Superfamily protein analogue 
according to Claim 1 for use in therapy or diagnosis.

53. An analogue according to Claim 52 for use in (a) 
imaging specific tissue in a host; or (b) irradiating 
specific tissue in a host; or (c) delivering a 
cytotoxic substance to specific tissue in a host; or
(d) lysing target cells in a host; or (e) modifying
the function of a cell surface receptor of specific
tissue in a host; or (f) effecting cell-cell 
interactions in a host. 

54. Use of a chimeric multivalent Ig Superfamily protein
analogue according to Claim 1 for the manufacture of a
diagnostic agent for imaging specific tissue in a
host.

55. Use of a chimeric multivalent Ig Superfamily protein
analogue according to Claim 1 for the manufacture of a
medicament for (a) irradiating specific tissue in a
host; or (b) delivering a cytotoxic substance to
specific tissue in a host; or (c) lysing target cells
in a host; or (d) modifying the function of a cell
surface receptor of specific tissue in a host; or (e)
effecting cell-cell interactions in a host.